Dieses Kapitel untersucht die formale Verifizierung der Cache-Färbung im Bao-Hypervisor, einem entscheidenden Prozess zur Gewährleistung einer sicheren und effizienten Ressourcenallokation in eingebetteten Systemen. Cache-Färbung ist von entscheidender Bedeutung für die Aufrechterhaltung der Speicherisolation und die Verhinderung von Side-Channel-Angriffen, insbesondere in Systemen mit mehreren virtuellen Maschinen (VMs), die Hardware-Ressourcen gemeinsam nutzen. Der Bao-Hypervisor implementiert einen ausgeklügelten Cache-Färbemechanismus, der Speicher basierend auf Cache-Sets segmentiert und sicherstellt, dass VMs nicht um dieselben Cache-Ressourcen konkurrieren. Das Kapitel geht näher auf die Implementierungsdetails der Cache-Färbung in Bao ein und beleuchtet die Herausforderungen, die von Bit-Level-Operationen, komplexer Arithmetik und verschachtelten Schleifen ausgehen. Es bietet einen umfassenden Überblick über den Verifikationsprozess auf der Frama-C-Plattform, einschließlich der Identifizierung und Behebung subtiler Fehler in der bestehenden Implementierung. Die Autoren präsentieren einen detaillierten Spezifikations- und Verifikationsansatz, der die Verwendung von Prädikaten, Geistercodes und Schleifeninvarianten demonstriert, um die Korrektheit des Cache-Färbungsmechanismus sicherzustellen. Das Kapitel diskutiert auch potenzielle Optimierungen und die umfassenderen Implikationen der formalen Verifikation für kritische eingebettete Systeme, was es zu einer wertvollen Ressource für diejenigen macht, die sich für fortgeschrittene Softwareentwicklung und Sicherheit interessieren.
KI-Generiert
Diese Zusammenfassung des Fachinhalts wurde mit Hilfe von KI generiert.
Abstract
Hypervisors allow sharing of computing resources between applications—possibly of various levels of criticality—that makes them increasingly relevant for modern embedded systems. In this context, memory isolation properties (including low-level cache isolation) are essential to guarantee. This paper presents a case study on formal verification of the cache coloring mechanism implemented in the Bao hypervisor. It proposes an original technique for coloring memory pages and assigning to each virtual machine only pages of certain colors, aimed to provide strong isolation guarantees. The implementation presents several challenges for formal verification, such as bit-level operations, complex arithmetic operations, multiple levels of nested loops, and linked lists. We identify two subtle bugs in the existing implementation breaking the expected guarantees, and propose bug fixes. We provide formal specification for the key functions of the mechanism and verify their (fixed) version in the Frama-C verification platform with a few lemmas proved in the Coq proof assistant. We present our specification choices, verification approach and obtained results. Finally, we outline possible optimizations of the current implementation.
1 Introduction
Hypervisors allow a host system to support multiple guest systems (virtual machines, or VMs) by virtually sharing its resources, such as memory and processing. Already intensively used in some domains (e.g. cloud infrastructures), hypervisors become highly relevant today for critical embedded systems due to an increasing number of necessary functions and features. Numerous functions have already been added to embedded systems, such as driver assistance or sensor management, and more functions need to be integrated today, for example, artificial intelligence (AI) solutions for mission-critical systems, or further entertainment and connectivity features. In many contexts, it is not possible to add more hardware because of size, weight and cost constraints. To enable this integration, it is necessary to share the same hardware between several functions (or systems), often with different levels of criticality. It can be achieved thanks to virtualization, when each system runs on a separate VM.
Hardware resources can be shared by the hypervisor in two ways: (i) time sharing: each VM has access to all resources in turn, i.e. VMs are scheduled; (ii) partitioning: each VM has access only to the part of resources dedicated to it. Time sharing requires a more complex and resource-hungry hypervisor, due to the scheduling function. That is why partitioning-based hypervisors (called static hypervisors) are more widely used in embedded systems. Static hypervisors allocate all hardware resources to VMs during the hypervisor start-up, so that each resource is allocated to only one VM. In addition, each VM has direct access to its resources, without interception by the hypervisor, which is particularly important for real-time systems. Thus, a static hypervisor seems to be an ideal solution for mixed-criticality systems.
Anzeige
However, some resources must be shared, such as processor last-level cache (LLC), which is by definition shared between several cores, each one possibly running a different VM. To tackle this problem, some static hypervisors implement cache coloring. The main idea is to split cache—without specific hardware—into several areas, each associated with a color. A color can then be associated with a VM, so that the data of memory pages used by this VM can be stored only in the cache area of the same color. The underlying implementation becomes more complex and highly critical, and its correctness is essential to guarantee.
The purpose of this work is formal verification of the cache coloring mechanism implemented in Bao [1, 37], an open-source static hypervisor used in embedded systems. While it proposes an elegant optimized implementation, its code is also challenging for formal verification because it contains non-trivial logic, bit-level operations, complex arithmetic operations, multiple levels of nested loops, and linked lists. During this case study, we identified two subtle bugs1 in the existing implementation breaking the expected guarantees, and proposed bug fixes2. We provide formal specification for the key functions of the mechanism and verify their (fixed) version in the Frama-C verification platform [10, 31, 33]. The proof requires carefully chosen predicates, ghost code, non-trivial loop invariants and lemmas. Some proof goals are not proved by automatic solvers: we prove them interactively (in Frama-C or in the Coq proof assistant [13, 40]). We present our specification choices, verification approach and obtained results, and outline possible further optimizations of the current code.
Contributions. The contributions of this work include:
a pedagogical presentation of the cache coloring mechanism of Bao;
an identification of some subtle bugs in its implementation and proposals of bug fixes, as well as suggestions of possible further optimizations;
formal specification and verification of a subset of (fixed) real-life code of this mechanism in Frama-C, publicly available via a companion artifact [27];
an overview of key specification choices, verification solutions and results.
In a broader sense, this work promotes rigorous software engineering approaches, contributes to an empirical evaluation of modern verification tools, and enriches the record of successful formal verification case studies for critical real-life code in industrially relevant contexts.
Anzeige
Outline. Section 2 presents Frama-C. Bao and cache coloring are described in Sect. 3. The considered implementation is presented in Sect. 4. Section 5 presents the bugs, suggested fixes and optimizations. Section 6 describes key specification choices, verification solutions and results. Finally, Sect. 7 provides some related work and concludes the paper.
2 Frama-C Verification Platform
Frama-C [10, 31, 33] is an open-source verification platform for C code. It offers various plugins along with a kernel providing basic services for source-code parsing and analysis. The program under analysis can be annotated in ACSL (ANSI/ISO C Specification Language) [11, 33], a formal specification language for C, that allows users to express functional properties of programs in the form of annotations, such as assertions or function contracts, written in special comments
and
. A function contract includes pre- and postconditions (resp., requires and ensures clauses) expressing properties that must hold, resp., before and after a call to the function. It also includes an assigns clause listing (non-local) variables and memory locations that the function is allowed to modify. The terminates\({\backslash }\)true clause specifies that the function must terminate. Users can add ghost code, used only for verification purposes and written in annotations
. Ghost code can also contain annotations, written in special comments /@...@/ and
. ACSL offers built-in predicates and logic functions to express frequent properties such as pointer validity or memory separation, and provides different ways to define new predicates and logic functions. As it is often done, in this document some ACSL notation (e.g.
,
,
,
,
) is pretty-printed (resp., as \(\forall \), \(\mathbb {Z}\), \(\Rightarrow \),\(\le \), \(\ne \)).
Frama-C offers a deductive verification plugin called Wp [33] . Given a C program annotated in ACSL, Wp generates the corresponding proof obligations (also called proof goals or verification conditions) that can be proved either by Wp itself, or (through the Why3 platform [28]) by SMT solvers [9, 14, 20] or an interactive proof assistant like Coq [13, 40]. To ensure the absence of runtime errors (RTE), Wp can automatically add necessary assertions and try to prove them as well. In this work, we chose to use Frama-C/Wp due to its capacity to perform deductive verification of industrial C code with successful verification case studies [23] and the fact that it is currently the only tool for C source code verification recognized by ANSSI, the French Common Criteria certification body, as an acceptable formal verification technique for the highest certification levels EAL6–EAL7 [24].
3 The Bao Hypervisor and Cache Coloring
The cache issue. Caches in modern CPUs are organized in levels: each core has a first-level cache, and the data in these caches are replicated in (possibly several levels of) higher-level caches until last-level cache (LLC), shared by all cores. While data from a given page is always stored in the same cache area, the memory-to-cache mapping is not bijective: data from different memory addresses can end up being stored in the same cache area. This reduces isolation guarantees and can potentially increase the risk of (e.g. side-channel) attacks. Moreover, memory addresses mapped to the same cache set compete for space. If a VM running on one core frequently accesses a large amount of data, it can monopolize the shared cache, slowing down other VMs running on nearby cores. This is a serious issue for real-time applications. To prevent this, the hypervisor must ensure that memory pages assigned to different VMs do not overlap in cache.
Cache coloring. Cache coloring is a technique that assigns colors to memory pages such that pages of the same color compete for the same cache sets, while pages of different colors do not compete for the same cache sets. In essence, cache coloring segments the main memory based on cache segmentation. The minimum number of colors is one (i.e., no cache coloring), and the maximum is determined by the number of cache sets of the different caches.
When a hypervisor assigns a unique color to each virtual machine—meaning a VM is loaded exclusively on pages of its color that have not been allocated to other VMs—it ensures that: (i) VMs cannot access each other’s data since they reside on separate pages in memory; (ii) VMs do not compete for the same cache sets because their data is stored in cache sets of different colors. Thereby, cache coloring is essential for memory isolation.
Bao. Bao [1, 37] is a lightweight open-source static hypervisor specifically designed for embedded systems and real-time applications. It focuses on providing strong isolation between VMs and ensuring real-time guarantees, being thus particularly well-suited for environments where both performance and reliability are critical. Bao elegantly implements a general version of cache coloring where the uniqueness property can be relaxed, that is, each VM accepts a subset of colors. It is crucial to ensure correctness of this implementation, that makes it a highly relevant target for formal verification.
4 Implementation of Cache Coloring in Bao
This section presents an overview of the cache coloring mechanism in the Bao hypervisor (see the real-life code in [1]), and a simplified version of its key functions (given in Fig. 3). Several syntactical changes were realized to make the real-life code more compact and clearer for the paper. The only semantical change is the removal of lock and unlock instructions (in the beginning and the end of function pp_next_clr) used to prevent concurrent modifications of a page pool and page allocation statuses, which are classic and orthogonal to our main scope. The semantics of other instructions (with all real-life code optimizations and bit-level operations) was carefully preserved.
Fig. 1.
The main memory layout in Bao with cache coloring, where the periodic block of colors is repeated to engender the coloring of pages for the whole memory.
4.1 Overview of the Implementation
When the option to use cache coloring is activated, Bao calculates during the boot the number of colors allowed by the hardware. Then, it attributes to every page a single color depending on the page number, so that its data is mapped to a cache area of the same color. COLOR_SIZE denotes the number of contiguous pages of the same color in memory, while COLOR_NUM represents the number of different colors in memory. In other words, pages are colored into the same color by consecutive groups of COLOR_SIZE pages, and the colors of the groups follow a constant sequence that loops every COLOR_NUM groups. Thus, the main memory is colored following a specific pattern—a periodic block of COLOR_NUM*COLOR_SIZE pages—as illustrated in Fig. 1 (where both constants are equal to 4, so the block contains 16 pages).
In a configuration file, for each VM, the user specifies a set of (possibly several) acceptable colors for pages where the VM will be loaded. When loading a VM, Bao maps the VM’s address space into free pages of acceptable colors.
Fig. 2.
Example of a pool of memory pages where COLOR_SIZE equals to 1 and COLOR_NUM equals to 2.
The allocation of suitably-colored pages is handled by function pp_alloc_clr (detailed below). It searches for a set \(\{p_1,\dots ,p_n\}\) of a required number n of free consecutive pages of acceptable colors \(c_1,\dots ,c_k\). To formalize these requirements for selected pages, it is convenient to introduce the notion of a pset (pronounced as p-set).
A set of pages \(\{p_1,\dots ,p_n\}\) is called a pset of n pages for acceptable colors \(c_1,\dots ,c_k\) if: (i) each page \(p_i\) in the set has one of the acceptable colors \(c_1,\dots ,c_k\); (ii) the pages of the set are (colorwise) consecutive, that is, there does not exist a non-selected page of an acceptable color between two selected ones (notice that there may exist a non-selected page between two selected pages if its color is not acceptable). We say that the pset is free if in addition: (iii) each page \(p_i\) in the set is free. We say that the pset is in (or inside) a pool of pages if: (iv) each page \(p_i\) belongs to the pool.
In this terminology, function pp_alloc_clr searches for a free pset of a given sizenfor given acceptable colors\(c_1,\dots ,c_k\)inside a given pool of pages. When the conditions are clear from the context, we may drop them and just say “a (free) pset”.
For example, in the pool of 8 pages shown in Fig. 2, the set \(\{p_2,p_4,p_6\}\) forms a free pset of size 3 for the yellow color; the set \(\{p_1,p_3\}\) does not form a free pset of size 2 for the blue color (since \(p_3\) is not free); while \(\{p_1,p_5\}\) is not a pset of size 2 for the blue color, as (ii) fails (page \(p_3\) is in-between).
Bao developers chose to search only for consecutive pages because it simplifies the process for other functions to access the newly allocated pages: from the starting page of a pset of size n, function pp_next_clr is iteratively called n times to obtain the first page of an acceptable color (that should return the starting page itself), then the second page of an acceptable color, and so forth (as it will be shown on lines 62–65 in Fig. 3).
Functions bitmap_get and bitmap_set are used, resp., to read and to write the allocation status of a page (allocated if nonzero or free if zero) from a bitmap, in which each bit represents the status of a page.
Fig. 3.
Simplified code of the cache coloring mechanism in Bao.
4.2 Basic Type Definitions and Constants
Lines 2–22 of Fig. 3 define basic types and constants used in the code. P_SIZE denotes the size of a memory page (in bytes). COLOR_NUM and COLOR_SIZE were presented above. CELL_SIZE defines the number of bits in an array cell of type u32, which will be used for a compact storage of bits in a bitmap (defined as an array of type u32*).
The page_pool structure (lines 5–11) represents a pool of pages, that is, a contiguous memory area, starting at the page address base and containing size pages. Each page is marked as free or allocated using the corresponding bit in bitmap. For heuristic purposes, the field last records the page that follows the last page of the last allocated pset. Field node is used (in higher-level functions) to link several pools into a linked list. Some other fields unrelated to the scope of this work were removed in this paper for simplicity (but this simplification does not impact the proof results).
A set of colors is encoded as a 64-bit unsigned integer, called a vector of colors, in which the i-th bit is set if the i-th color is authorized. The ppages structure (lines 12–16) is used to store a pset, described by the first page’s address base, the number of pages num_pages and the vector of acceptable colors colors.
4.3 Implementation of pp_next_clr
Function pp_next_clr (see Fig. 3, lines 23–29) looks for a first suitably-colored page starting from a given page. This function takes as arguments the address of a base page base, an offset from (in terms of page numbers with respect to the base page) of the starting page of the search, and a color vector colors indicating the acceptable colors. It returns the offset (again, in terms of page numbers with respect to the base page) of the first page whose color is one of the acceptable colors specified in the color vector. Notice that while the base page base is given by its address, the starting page and the returned page are identified by their page number offsets (with respect to the number of the base page) and not their address offsets. The page number of the base page with address base is base / P_SIZE, while the starting page defined by the number offset from has page number base / P_SIZE + from and address base + from * P_SIZE.
Calculation of the color of a page. In Bao, the cache coloring mechanism defines the color of a page with page number PNum through the formula:
PNum / COLOR_SIZE % COLOR_NUM. (\(\textrm{A}\))
The function keeps track of the offset of the current candidate page in the variable index, initialized to from, see line 25. The page number of the current page is base / P_SIZE + index and its color is naturally calculated as:
where clr_offset is the (page number) offset of the base page with respect to the beginning of its periodic block of colored pages, defined on line 24.
Going through the pages. To find the next suitably-colored page, the function iterates over the pages (through the loop on lines 26–27), starting from the index from (line 25), each time checking if the color is acceptable using a bit shift of the color vector. The color c is acceptable if and only if \( \texttt {!((colors >> c) \& 1)}\) (line 26). Once a page with an acceptable color is found, the loop condition fails, and the function returns the offset of the found page on line 28.
Callers always check that the color vector colors contains (hence, accepts) at least one existing color, which ensures the termination of the loop as it will eventually find a page of an acceptable color. Notice that the function does not guarantee that the returned page belongs to the valid range of page indices; this verification is supposed to be done in the upper-level functions.
4.4 Implementation of bitmap_get and bitmap_set
Function bitmap_get (see Fig. 3, lines 31–33) checks the allocation status of pages encoded by a bitmap. It takes two arguments: a bitmap map and a bit number bit, and returns 1 if the bit-th bit is set in map, and 0 otherwise. The bit-th bit is contained in the cell of index bit / CELL_SIZE, at offset bit % CELL_SIZE. This explains the calculation on line 32. Similarly, function bitmap_set (line 35–37) updates the allocation status of a page.
4.5 Implementation of pp_alloc_clr
Function pp_alloc_clr (see Fig. 3, lines 39–76) searches for a given number of free consecutive pages of acceptable colors in a given page pool, that is, a free pset. The function takes four arguments: a pointer to a pool structure pool, the number of pages n, a vector colors of acceptable colors, and a pointer to a physical page structure ppages to store the result of the search. In case of success, it returns (inside the structure) the number of pages n and the address of the first page; otherwise, it sets the number of pages to 0.
The variable index contains the number offset of the current candidate page (with respect to the base page of the pool). The search proceeds in two phases performed by two iterations of the loop on lines 47–74. During the first phase (i==0), it starts by searching the first page of an acceptable color from the page with number offset last (cf. line 45). Recall that last stores the page that follows the last page of the last pset found by the function. The intuition behind this heuristic is that starting from the last page is on average more efficient than starting always from the beginning of the pool, because after several allocations the pages in the beginning of the pool will be more likely to be already allocated. To be exhaustive, the second phase (i==1) starts from the beginning (line 71).
In each phase, the loop on lines 48–58 scans for free psets. It stops either when it finds a free pset of size n of acceptable colors or when the current candidate page runs outside the pool (cf. lines 46, 48).
To find such a pset, the function first searches for the first free page of an acceptable color, as shown in the loop on lines 50–51. If the candidate page has already been allocated (line 50), the function moves to the next candidate page of an acceptable color (line 51). The loop continues until it finds a free page of an acceptable color or the current candidate page runs outside the pool.
If the first page is within the pool and free, the loop on lines 53–56 verifies that the next n-1 consecutive pages of acceptable colors are also within the pool and free. As long as n suitable pages are not yet found, the loop condition checks that the previously found page is free and belongs to the pool (line 53), and the loop body identifies the following page of an acceptable color (line 55). The number of already found pages is maintained in the counter allocated (cf. lines 40, 49, 54).
The iteration of the loop on lines 48–58 stops when it has found n pages (that is, a free pset is found) or when the candidate page is outside the pool or allocated. If the candidate page is allocated, the search for a new pset restarts just after the last candidate page, as shown on line 55. (The fixes for lines 57 and 71 are discussed in Sect. 5.)
If the function successfully finds n pages (line 59), it marks these pages as allocated (loop on lines 62–65), updates the number of allocated pages to n (line 60), sets the address of the first page in ppages (line 61), updates the address of the last allocated page of the pool (line 66), and, finally, returns.
If the function does not find n pages, it returns with ppages containing zero allocated pages, as set initially on line 44. The function always terminates and examines all pages in the pool (at least in the second phase).
5 Bugs, Corrections and Further Optimizations
Bugs and fixes. The current version of pp_alloc_clr, contrary to its intended behavior, does not guarantee that the returned set is indeed a free pset of n pages in the pool. In some intricate cases, depending on the status of the pages, the n-th page might be already in use or outside the pool. The first case may break memory isolation, while the second case may cause the VM to crash.
The bugs reside in the selection of the first page of a candidate pset: the function may choose a first page whose color is not acceptable. Indeed, the loop calculating the first page (lines 50–51) only checks that the page is free, without checking its color. This is sufficient for the very first execution or if the loop has already been executed at least once, as a call to pp_next_clr (resp., on line 45 or 51)—to select the new candidate page—guarantees it has an acceptable color.
However, if the function fails to find a pset of size n, it wrongly starts a new search from the page following the last candidate page (see line 57), whose color may be unacceptable. Additionally, during the second phase, the function starts searching from page index 0 (line 71), which might also have an unacceptable color. In these two faulty cases, if the candidate page is free, it will be selected as the first page of a tentative pset (lines 50–52).
To fix these bugs, we should ensure that the first page has an acceptable color before entering the loop. We propose two bug fixes: we remove line 57 and modify line 71 to index=pp_next_clr(pool->base, 0, colors);. These bugs were discovered during the formal specification step, and the fixed version was formally proved with Wp.
Counterexample. To illustrate the first bug, consider the mock pool of Fig. 2 on which pp_alloc_clr is called to find a free pset of size 2 for the blue color with pool->last==p0. The function will succeed and wrongly return (the address of page) p4 in ppages->base as the first page of a pset. Recall (cf. Sect. 4.1) that in higher-level functions the pages are assigned to a VM via consecutive calls to pp_next_clr starting from the first page (like on lines 62–65). The VM will receive pages p5 and p7, the latter being potentially already allocated to another VM! This counterexample (along with another one, due to the second bug) was formally confirmed in Frama-C with the static value analysis plugin Eva [33]. Eva was used to confirm the undesired situation (described with a few ACSL annotations that were proved by Eva) to avoid any risk of misinterpretation of the code. The counterexamples can be found in the companion artifact [27].
Suggestions of optimizations. It would be sufficient for the second phase in the outer loop on lines 47–74 (cf. Sect. 4.5) to perform the search of the first pset page untilpool->last, instead of uselessly performing a full search until the end of the pool (and re-exploring the pages tried in the first phase). This can be done, for instance, by adding if(i==1\(\wedge \)index\(\ge ~\)pool->last)returnok; as a second instruction in the body of the loop on lines 50–51. Another optimization can be to perform direct jumps to the first page of the next color without enumerating all pages (as it is done on lines 26–27 in function pp_next_clr, very frequently called). This can be realized e.g. with a precomputed array of jumps, based on the number offset of the current page inside its periodic block of colors. We plan to submit these and some other suggestions to Bao developers before integrating them into the code under verification.
6 Verification of Cache Coloring
This section presents key specification and verification points and the results of the case study. Its full annotated code can be found in the companion artifact [27]. We mainly focus in the paper on the verification of the key functions presented in Fig. 3. The specified and verified code also includes a simplified version of two higher-level functions (pp_alloc_ppages and mem_map), which were verified to get confidence in consistency of the proposed contracts for the key functions with the expected behavior in the callers. For an easier navigation, unless otherwise stated, the line numbers in the figures and text below are kept as in the full annotated code.
6.1 Basic Predicates and Flattening Invariants
Fig. 4.
Predicates used in the specification of the cache coloring mechanism of Bao.
In this case study (cf. Fig. 3, lines 17, 22), we consider a 64-bit implementation with \(2^{12}\)-byte pages and a maximum number of pages of \(2^{52}\), which aligns with the maximum number of pages supported by most 64-bit architectures. Additionally, we consider 64-bit long color vectors, which sets the maximal number of colors to 64 accordingly, and we do not impose any prior constraints on COLOR_SIZE to ensure compatibility with a wide range of hardware configurations, as specified in the definition of predicate ValidCacheCfg (Fig. 4, line 40).
Predicate IsValidPool (line 41) ensures that pool represents a valid segment of memory, and that its bitmap is sufficiently large to store the status of its pages and does not overlap with the pool structure. Macros P_NB and P_NB_MAX were defined in Fig. 3, lines 21–22.
Earlier verification efforts with Frama-C (e.g. [23]) demonstrated that reasoning on array cells instead of bits makes solvers more efficient. We introduce a global companion ghost array u8 gPStatus[P_NB_MAX] to store page allocation statuses, and express the equivalence between a bitmap and the companion array with predicate flatPoolStatus (Fig. 4, line 47). It guarantees that checking the i-th bit in the bitmap is equivalent to checking the i-th cell in gPStatus. A starting letter g (e.g. in gPStatus) indicates a ghost variable name in this work.
Similarly, we introduce a global companion ghost array "u8 gFlatClrs[64]" to flatten the color vector (unchanged in our scope4), and express the equivalence between a color vector and the companion array with predicate flatClrs (line 53), i.e. that checking the i-th bit in color vector clrs is equivalent to checking the i-th cell in gFlatClrs. Maintaining such flattening invariants in contracts enables expressing properties on array cells instead of bits.
Predicates IsInClrs and IsNotInClrs (lines 55, 56) state that color clr is, resp., acceptable and unacceptable w.r.t. the color vector encoded in gFlatClrs.
Predicate HasClrPages (line 57) states that array PArr of size n contains page number offsets (with respect to the base page p_base) of pages with acceptable colors. Labels L1 and L2 characterize, resp., the moment of calculation of the color and of reading the cell in PArr. Such a distinction of labels will often be used in predicates below. Logic function P_CLR(PNum) computes the color of a given page as in (\(\textrm{A}\)) in Sect. 4.2. Predicate NoClrPBtw (line 60) states that there is no page of an acceptable color with number offset between start and end (excluded) with respect to p_base.
Predicate HasSeqPages (line 62) ensures that page number offsets stored in array PArr of size n are in ascending order, and any other page between them does not have an acceptable color. Predicate HasPagesInPool (line 66) states that page number offsets stored in array PArr of size n are within the memory pool pointed to by pool.
Predicate PSetInPool (line 69) states that the page number offsets stored in array PArr of size n are within the memory pool pointed to by pool, consecutive and of an acceptable color. In other words, array PArr is a pset of size n for colors inside pool.
Predicate HasFreePages (line 74) states that pages with page number offsets stored in array PArr of size n are free according to gPStatus. Likewise, predicate HasAllocPages (line 77) states that those pages are allocated.
In order to verify the code with the deductive verification plugin Wp of Frama-C, we provide an ACSL specification for each of the considered functions. We overview here the contracts of pp_next_clr and pp_alloc_clr.
6.2 Specification of pp_next_clr
Fig. 5.
Contract of pp_next_clr.
Preconditions. Given the scope of our verification, we bound the range of the page number offset from between 0 and \(2^{52}\) (excluded), assuming there can be a single memory pool supporting up to \(2^{52}\) pages (Fig. 5, line 145). We did not need to impose specific constraints on the address base since we are considering 12-bit wide pages and 52-bit wide page numbers, so the address is naturally bounded by its type. Predicate ValidCacheCfg (line 144) specifies the considered arithmetic constraints. The equivalence between the color vector and the companion array must hold before and after the call (lines 146, 151). Finally, recall that callers ensure that the color vector accepts at least one color (cf. Sect. 4.3). To guarantee termination, we express this constraint on line 148, where the existential property is replaced by a witness—a global ghost variable gClrValid—since this value will be used in ghost code inside the function. We preferred this (simple and sufficient) option for our scope to the alternative when a witness has to be found inside the function from an existential precondition.
Postconditions. We express that the function always terminates (line 149) and does not modify the memory (line 150). Finally, we express the functional properties. The returned page has an acceptable color (line 152) and is the closest page with this property to from, the starting page of the search (line 153); Line 154 gives an interval of values for the result, a maximum offset being the size of a color block (line 154). This upper bound is tight5 and suffices to prove the absence of overflows during the update of index.
6.3 Specification of pp_alloc_clr
Preconditions. The preconditions (omitted in the paper) are relatively natural and mostly similar to those of pp_next_clr. An interval of values is specified for the number of allocated pages n, and the validity of ppages is required. The validity of pool and flattening invariants are present both in preconditions and postconditions (the latter on lines 259–261 in Fig. 6).
Additionally, we had to add a dozen of explicit separation clauses (omitted in the paper) between the arguments and the ghost variables. Some of these separation predicates are likely to become unnecessary in a future version of Frama-C/Wp that will be capable to deduce that the modification of ghost variables cannot impact non-ghost variables, and vice versa.
Fig. 6.
Selected postconditions of the contract of pp_alloc_clr.
Postconditions. At the end of the function, there are two possible return values, 0 and 1 (line 258). Other notable postconditions fall into two categories: those for the success case and those for the failure case.
Predicates that hold on success (when the function returns 1) must ensure that subsequent calls in the callers (cf. Sect. 4.1) to pp_next_clr starting from the first allocated page—the only page returned in ppages—will really return pages of a required pset in the pool (whose pages were free before the call and then marked as allocated by the function). Since ACSL does not allow using the C function pp_next_clr in the specification, we used predicates over the selected pages. We capture these pages by their number offsets (with respect to the starting page of the pool) in a global ghost array u64 gFoundPSet[P_NB_MAX], by adding ghost code into the function. It is another illustration of an advantageous usage of ghost code artifacts for the specificaiton.
Precisely, we state that n pages in gFoundPSet were free before the call (lines 268–269) and are now allocated (line 272); and constitute a pset of size n for colors (line 270–271). Moreover the ppages structure must contain n pages and store the page address corresponding to the first cell of the gFoundPSet (line 273–274).
In case of a failure, the function returns 0 (line 279), it has not modified the page status array (lines 280–281) nor the pool (line 282–283).
Specification completeness. The completeness and disjointness of the two cases of the specification was non-trivial to ensure because of the complexity of the calling cases: either there exists a suitable subset of pages—free pset—on entry, or not. As the size of the subset depends on an argument of the function, the conditions involve an undetermined number of pages. Expressing such properties with an undetermined number of quantifiers is not directly allowed in ACSL and would only be possible indirectly (e.g. with a list, an array, or a set). However, since solvers often have issues with complex conditions involving multiple quantifiers, we decided to adopt another, more pragmatic approach.
We decided to represent the existence of a suitable subset of pages through the existence of a witness array containing the number offsets of the pages. Thus, we stated the existence case assumption by giving a witness pset in a global companion ghost array u64 gExistPSet[P_NB_MAX], see lines 264–265. This establishes that the function returns 1 in this case. But this unique implication is not sufficient: the function could still return 0 while a suitable free pset existed on entry.
That is why we added another clause (lines 266–267) stating that the companion ghost array gFoundPSet mentioned above—with its values on exit—was a suitable pset (of size n with acceptable colors inside pool) already on entry. Along with lines 268–269, we deduce a condition similar to that on the left of the implication on lines 264–265.
Recall that the first label indicates when the property is evaluated while the second label indicates at which state the array values are read. Notice that, while the aforementioned clause on lines 270–271 looks similar, strictly speaking, it does not directly state the same property as on lines 266–267, since it considers the property at label Post instead of Pre (the values of gFoundPSet being, of course, considered at Post in both cases as it is computed during the function).
Therefore, we can deduce that the function returns 1if and only if a suitable subset of pages existed on entry, before the call. As the function can only return 0 or 1 (line 258), our specification of both cases is complete and disjoint.
We did not use ACSL behaviors because Frama-C would not be capable to prove that behaviors are complete and disjoint for this version of specification. That is why we justify it here with an additional argument, external to Frama-C.
6.4 Selected Aspects and Difficulties of the Proof
The proof required carefully chosen predicates, ghost code and ghost variables, loop invariants, assertions and lemmas. The predicates, ghost variables and our approach to ensure the completeness of the specification of pp_alloc_clr were presented above. A companion ghost model and flattening invariants helped to efficiently deal with bit-level operations. This section presents some other selected aspects and verification choices.
Termination of. To prove termination, we compute (in ghost code) an upper bound for the number offset index using the witness color gClrValid. We distinguish two cases of relative position of the starting page in its periodic color block: either its color lies before the existing acceptable color gClrValid or after it in (in the latter case, the upper bound is in the next color block).
Proof of. With three levels of nested loops, a significant number of carefully chosen loop invariants was necessary. For instance, to prove that the function finds a pset in case there exists a suitable pset in the pool (lines 264–265), we have to ensure that if such a pset exists in the pool, then it is located in the part of the pool the function has not explored so far. Thus, if the function fails to find such a pset after going through the entire pool, then the existing pset must be located outside the memory, which is contradictory. Due to the structure of the function, we express this property in the main loop and then recursively in the nested loops to have it preserved. Figure 7 shows the invariants for two of them. We constrain only the second phase since it runs—in the current version—a full search from the beginning of the pool, that explains the condition i==1. The second invariant is relatively tricky. Indeed, the loop on lines 53–56 in Fig. 3—that attempts to complete a previously identified first page to a full pset—may possibly find the witness pset pExistPSet or another existing pset starting before it. To address this, we adjust the loop invariant by stating that we did not miss the witness psetpExistPSet: if the current candidate page is greater or equal to the first page of pExistPSet, then it lies inside it (line 375).
Fig. 7.
Loop invariants for the loops on lines 48–58 (above) and lines 53–56 (below) of pp_alloc_clr (in Fig. 3), used to prove the success when a pset exists on entry.
Arithmetic lemmas. As page colors are computed with modulo and division operations, reasoning about them involves such arithmetic operations. The solvers we used were unable to handle them directly. To address this issue, we introduced four arithmetic lemmas, and had to prove three of them in Coq (see the companion artifact [27] for Coq proof scripts). One lemma, proving the equivalence of (B) and (C) (see Sect. 4.3), is shown in Fig. 8.
Fig. 8.
One of the four arithmetic lemmas used in the proof of pp_next_clr.
Separation issues andFrama-C’s memory model. The need for additional separation clauses (in particular, between ghost and non-ghost variables) was already mentioned in Sect. 6.3. In many parts of the code, we also encountered difficulties in proving the preservation of seemingly trivial properties through assignments. These difficulties stem from the memory model used in Frama-C/Wp, where pointers are treated as indices within arrays, where cells correspond to the pointed values. Consequently, properties involving pointers in ACSL are translated into properties over arrays in Wp. When a pointed value is modified, the whole array is seen as possibly modified, making proofs non-trivial for solvers. To prove such properties, we often had to manually create proof scripts in Wp to demonstrate that the pointed values used in predicates remain unchanged through assignments. This process introduced a significant specification and verification overhead making the verification process more complex to maintain.
Semantic lemmas. To show the preservation of the PSetInPool predicate between two program points despite the modification of some variables, a preservation lemma was necessary (lemma PSetInPool_preserved in the companion artifact [27]). While the idea is well-known, a very careful formulation with four labels was necessary since each predicate has two labels. Moreover, its proof required a manually crafted proof script in Wp with carefully selected tactics.
Function pp_next_clr must ensure that the pages of the found free pset eventually become allocated. This task is handled in the loop on lines 62–65 in Fig. 3, which iterates through the found page indices and marks them as allocated. However, the function moves to the next page of the pset by calculating it through a call to pp_next_clr. To ensure that the function gets the same page indices as that of the free pset found earlier, another interesting lemma (lemma unique_next_clr_page in the companion artifact [27]) was necessary.
Linked lists. During the verification of higher-level function pp_alloc_ppages, which looks for a free pset of a given size in a set of pools represented by a linked list of pools (as mentioned in Sect. 4.2), an additional difficulty was related to linked lists. Indeed, contrary to simple linked lists, in our case list nodes contain several data fields and pointers to external arrays. Broadly inspired by previous work [15, 16, 36], this issue was solved using a companion ghost array containing the addresses of the nodes of the linked list and by defining and maintaining a suitable linking predicate, which establishes the link between them. Detailed specifications are available in the companion artifact [27].
Unstable proof scripts inWp. During the last stages of the case study, we discovered an issue in Frama-C/Wp related to proof scripts. A created script, which leads to a successful proof at the time of its creation, fails with error messages during the proof replay. Presumably, this comes from a different degree of proof goal simplifications during the script creation and the proof replay, resulting in slight differences in the proof goal. This issue has been reported to the Wp team.
6.5 Proof Statistics
This verification case study took approximately three months of intensive work, including understanding the implementation, formal specification, verification, detecting and fixing the bugs, readability improvements and restructuring of the specification for the paper. Formal verification was, initially, carried out on the four key functions: pp_next_clr, bitmap_get, bitmap_set and pp_alloc_clr. To ensure the relevance of the proposed contracts, formal specification and verification for simplified versions of two upper-level functions, mem_alloc_ppages and mem_map, were realized as well, focusing on the mapping of colored pages (and excluding other behavior e.g. when cache coloring is deactivated). The former one searches for a pset in a linked list of pools, while the latter calls the former to assign a pset of pages to a VM. They are not detailed in the paper, but the annotated code is available in the companion artifact [27]. The claim that formal verification is complete can be demonstrated with the artifact.
The specified functions total, approximately, 100 lines of C code and 600 lines of ACSL. ACSL annotations include ghost code (20 lines), predicates (100 lines), contracts (455 lines), assertions (30 lines), and lemmas (25 lines).
The proof goals include function contracts, assertions, lemmas, the absence of run-time errors, smoke tests (to detect potential specification inconsistencies), and memory hypotheses made by Wp ’s typed memory model; they result in 463 proof goals and 60 extra goals for smoke tests; that is, 523 in total.
The proof was carried out with Frama-C v.29.0 and Why3 1.7.2, with the external solvers Alt-Ergo 2.5.4, CVC5 1.0.9 and Z3 4.8.12 (run in that order), and the proof assistant Coq 8.18.0. The proofs were run on a desktop computer running Ubuntu 22.04.5 LTS, with an Intel® Core™ i5-1145G7 CPU @ 2.60 GHz, featuring 4 cores 8 threads, with 32 GB RAM. We ran Frama-C/Wp with options-wp-par=8 and-wp-timeout=40.
The full proof takes approx. 5 minutes. All smoke tests passed. Over the 523 goals6, around 1% (6) were discharged by control-flow analysis; around 83% of the goals (433) were proved by automatic solvers: the internal simplifier engine Qed of Wp handled around 53% of the goals (277) in an average time of 146ms per goal, then Alt-Ergo discharged around 28% of the goals (147) in an average time of 110ms, CVC5 covered around 5% of the goals (26) in an average time of 725ms, Z3 proved 2% of the goals (9) in an average time of 1.4s. Around 11% of the goals (55) were achieved through proof scripts in Wp, while less than 1% of the goals (3) were proved in Coq. At the end of the case study, when the authors were used to proof contexts, the scripts in Wp required around 5 hours to be fully re-created manually, which was often necessary after code and specification updates. The proof scripts in Coq required a couple of hours to be created manually (and did not need to be re-created after the first attempt).
7 Conclusion and Future Work
Related work. A number of hypervisors are in use today. Some are used in IT infrastructures (e.g cloud) for their flexibility and dynamic resource management such as Xen [6], VMWare [4] or KVM [3]. Others are better suited to critical embedded systems such as Xen Dom0-less [5], Jailhouse [2] or Xtratum [7]. In this case, it is the static resource sharing property that is exploited. The use of hypervisors in critical embedded systems requires a high level of confidence in resource allocation, and particularly in maintaining isolation between VMs. Formal verification has been applied to provide high confidence in some resource allocation systems, such as ProvenCore [17] ans seL4 [32]. To the best of our knowledge, formal verification of cache coloring has never been addressed in previous work.
More generally, this work is related to other verification case studies on real-life code and empirical evaluations of verification tools [29]. Among other examples, the KeY tool was used for verification of several libraries and applications in Java [12, 22]. Verification of a traffic tunnel control system [38] was realized with VerCors [8]. Verification for a real-world avionics example [25] and for security properties [23] provided useful feedback on using Frama-C. SPARK was used in the verification of a TCP Stack [19] and complex datastructures [26]. Verification of the Hyper-V hypervisor with VCC [34] highlighted some issues specific to hypervisor verification. Deductive verification of smart contracts [18] was realized with Dafny [35]. Several case studies [39] were performed using VeriFast [30]. Each new case study contributes to enhance verification tools by identifying their limitations and to push further the frontiers of what is achievable for formal verification.
Conclusion. This paper has presented a formal verification case study for an original, industrially relevant and security-critical target—the cache coloring in Bao. We have given its pedagogical presentation and emphasized main aspects of its verification with Frama-C. The target code is very elegant but challenging for deductive verification (containing bit-level operations, non-trivial logic, complex arithmetic operations, multiple nested loops, linked lists). This case study contributes to a better understanding of the capacities of modern deductive verifiers. It also allowed us to identify and fix two bugs in the target code, to suggest its further optimizations, and to discover a minor issue in the verification tool.
Future work. Future work includes the verification of optimized versions of cache coloring and a larger verification of critical parts of the Bao hypervisor, with a long-term goal to reach a highly optimized, provably correct static hypervisor ensuring strong isolation properties and suitable for modern embedded systems. Another work direction is to enhance automatic proof script generation [21].
Data availability statement. The companion artifact [27] contains the annotated code, counterexamples and a virtual machine (with all necessary tools installed), ready to reproduce the proof.
Acknowledgment
Part of this work was supported by ANR (grants ANR-22-CE39-0014, ANR-22-CE25-0018). We warmly thank Téo Bernier for his valuable advice and help, Allan Blanchard, Loïc Correnson and Frédéric Loulergue for fruitful discussions, the whole Frama-C team for their support, and the anonymous referees for helpful comments.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Shortly before the final submission of this paper, the authors reported the bugs and the suggested fixes to Bao developers, who integrated the proposed fixes into the code (commit ee73f7e in the Bao repository [1] on January 6, 2025).
The per-solver results are given as an indication of a possible proof run, can vary and should not be used to compare solvers or draw any conclusions about their relative efficiency; our purpose was to reach a full proof and not to compare the solvers.