
Journal of Automated Reasoning


Open Access 12-02-2019

Priority Inheritance Protocol Proved Correct

Authors: Xingyuan Zhang, Christian Urban, Chunhan Wu

Published in: Journal of Automated Reasoning | Issue 1/2020


Abstract

In real-time systems with threads, resource locking and priority scheduling, one faces the problem of Priority Inversion. This problem can make the behaviour of threads unpredictable and the resulting bugs can be hard to find. The Priority Inheritance Protocol is one solution implemented in many systems for solving this problem, but the correctness of this solution has never been formally verified in a theorem prover. As already pointed out in the literature, the original informal investigation of the Priority Inheritance Protocol presents a correctness “proof” for an incorrect algorithm. In this paper we fix the problem of this proof by making all notions precise and implementing a variant of a solution proposed earlier. We also generalise the scheduling problem to the practically relevant case where critical sections can overlap. Our formalisation in Isabelle/HOL is based on Paulson’s inductive approach to protocol verification. The formalisation not only uncovers facts overlooked in the literature, but also helps with an efficient implementation of this protocol. Earlier implementations were criticised as too inefficient. Our implementation builds on top of the small PINTOS operating system used for teaching.

This paper is a revised, corrected and expanded version of [31]. In Sect. 4 we improve our previous result by proving a finite bound for Priority Inversion. Moreover, we give more details about our proof in this paper, describe some of our (unverified) C-code for implementing the Priority Inheritance Protocol, and survey the existing literature in more depth. Our C-code closely follows all results we proved about optimisations of the Priority Inheritance Protocol.


1 Introduction

Many real-time systems need to support threads involving priorities and locking of resources. Locking of resources ensures mutual exclusion when accessing shared data or devices that cannot be preempted. Priorities allow scheduling of threads that need to finish their work within deadlines. Unfortunately, both features can interact in subtle ways leading to a problem, called Priority Inversion. Suppose three threads having priorities H(igh), M(edium) and L(ow). We would expect that the thread H blocks any other thread with lower priority and the thread itself cannot be blocked indefinitely by threads with lower priority. Alas, in a naive implementation of resource locking and priorities, this property can be violated. For this let L be in the possession of a lock for a resource that H also needs. H must therefore wait for L to exit the critical section and release this lock. The problem is that L might in turn be blocked by any thread with priority M, and so H sits there potentially waiting indefinitely (consider the case where threads with priority M continuously need to be processed). Since H is blocked by threads with lower priorities, the problem is called Priority Inversion. It was first described in [12] in the context of the Mesa programming language designed for concurrent programming.

If the problem of Priority Inversion is ignored, real-time systems can become unpredictable and resulting bugs can be hard to diagnose. The classic example where this happened is the software that controlled the Mars Pathfinder mission in 1997 [21]. On Earth, the software ran mostly without any problem, but once the spacecraft landed on Mars, it shut down at irregular, but frequent, intervals. This led to loss of project time as normal operation of the craft could only resume the next day (the mission and data already collected were fortunately not lost, because of a clever system design). The reason for the shutdowns was that the scheduling software fell victim to Priority Inversion: a low priority thread locking a resource prevented a high priority thread from running in time, leading to a system reset. Once the problem was found, it was rectified by enabling the Priority Inheritance Protocol (PIP) [24]¹ in the scheduling software.

The idea behind PIP is to let the thread L temporarily inherit the high priority from H until L leaves the critical section unlocking the resource. This solves the problem of H having to wait indefinitely, because L cannot be blocked by threads having priority M. While a few other solutions exist for the Priority Inversion problem, PIP is one that is widely deployed and implemented. This includes VxWorks (a proprietary real-time OS used in the Mars Pathfinder mission, in Boeing’s 787 Dreamliner, Honda’s ASIMO robot, etc.) and ThreadX (another proprietary real-time OS used in nearly all HP inkjet printers [28]), but also the POSIX 1003.1c Standard realised for example in libraries for FreeBSD, Solaris and Linux.

Two advantages of PIP are that it is deterministic and that increasing the priority of a thread can be performed dynamically by the scheduler. This is in contrast to Priority Ceiling [24], another solution to the Priority Inversion problem, which requires static analysis of the program in order to prevent Priority Inversion, and also in contrast to the approach taken in the Windows NT scheduler, which avoids this problem by randomly boosting the priority of ready low-priority threads (see for instance [2]). However, there has also been strong criticism of PIP. For instance, PIP cannot prevent deadlocks when lock dependencies are circular, and blocking times can be substantial (more than just the duration of a critical section). Most criticism of PIP, however, centres on unreliable implementations and on PIP being too complicated and too inefficient. For example, Yodaiken writes in [30]:

“Priority inheritance is neither efficient nor reliable. Implementations are either incomplete (and unreliable) or surprisingly complex and intrusive.”

He suggests avoiding PIP altogether by designing the system so that no priority inversion may happen in the first place. However, such ideal designs may not always be achievable in practice.

In our opinion, there is clearly a need for investigating correct algorithms for PIP. A few specifications for PIP exist (in informal English) and also a few high-level descriptions of implementations (e.g. in the textbooks [15, Section 12.3.1] and [26, Section 5.6.5]), but they help little with actual implementations. That this is a problem in practice is proved by an email by Baker, who wrote on 13 July 2009 on the Linux Kernel mailing list:

“I observed in the kernel code (to my disgust), the Linux PIP implementation is a nightmare: extremely heavy weight, involving maintenance of a full wait-for graph, and requiring updates for a range of events, including priority changes and interruptions of wait operations.”

The criticism by Yodaiken, Baker and others suggests another look at PIP from a more abstract level (but still concrete enough to inform an implementation), and makes PIP a good candidate for a formal verification. An additional reason is that the original specification of PIP [24], despite being informally “proved” correct, is actually flawed.

Yodaiken [30] and also Moylan et al. [16] point to a subtlety that had been overlooked in the informal proof by Sha et al. They specify PIP in [24, Section III] so that after the thread (whose priority has been raised) completes its critical section and releases the lock, it “returns to its original priority level”. This leads them to believe that an implementation of PIP is “rather straightforward” [24]. Unfortunately, as Yodaiken and Moylan et al. point out, this behaviour is too simplistic. Moylan et al. write that there are “some hidden traps” [16]. Consider the case where the low priority thread L locks two resources, and two high-priority threads H and \(H'\) each wait for one of them. If L releases one resource so that H, say, can proceed, then we still have Priority Inversion with \(H'\) (which waits for the other resource). The correct behaviour for L is to switch to the highest remaining priority of the threads that it blocks. A similar error is made in the textbook [20, Section 2.3.1] which specifies for a process that inherited a higher priority and exits a critical section that “it resumes the priority it had at the point of entry into the critical section”. This error can also be found in the textbook [14, Section 16.4.1] where the authors write about this process: “its priority is immediately lowered to the level originally assigned”; and also in the more recent textbook [13, Page 119] where the authors state: “when [the task] exits the critical section that caused the block, it reverts to the priority it had when it entered that section”. The textbook [15, Page 286] contains a similar flawed specification and even goes on to develop pseudo-code based on this flawed specification. Accordingly, the operating system primitives for inheritance and restoration of priorities in [15] depend on maintaining a data structure called inheritance log. 
This log is maintained for every thread and broadly specified as containing “[h]istorical information on how the thread inherited its current priority” [15, Page 527]. Unfortunately, the important information about actually computing the priority to be restored solely from this log is not explained in [15] but left as an “exercise” to the reader. As we shall see, a correct version of PIP does not need to maintain this (potentially expensive) log data structure at all. Surprisingly also the widely read and frequently updated textbook [25] gives the wrong specification. On Page 254 the authors write: “Upon releasing the lock, the [low-priority] thread will revert to its original priority.” The same error is also repeated later in this popular textbook.
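The two-lock scenario described above can be made concrete with a small Python sketch. The thread names, the numeric priorities and both helper functions are illustrative assumptions and are not taken from any of the cited sources. The first helper mimics the flawed "return to the original priority" rule; the second recomputes the priority from the threads that are still blocked.

```python
def restore_original(original_priority, still_blocked):
    """Flawed rule: always fall back to the thread's own priority."""
    return original_priority


def restore_highest_remaining(original_priority, still_blocked):
    """Corrected rule: keep the highest priority among the thread itself
    and all threads it still blocks."""
    return max([original_priority] + list(still_blocked.values()))


# L (priority 1) holds two locks; H (priority 9) waits for the first,
# H' (priority 8) waits for the second.
L_PRIORITY = 1
blocked_by_L = {"H": 9, "H'": 8}

# L releases the first lock, so H is no longer blocked by L.
del blocked_by_L["H"]

flawed = restore_original(L_PRIORITY, blocked_by_L)
correct = restore_highest_remaining(L_PRIORITY, blocked_by_L)
print(flawed, correct)
```

Under the flawed rule L drops back to its own priority although it still blocks H', so a medium-priority thread could again delay H'. The second rule needs no history of earlier inheritance steps, only the set of threads that are currently blocked.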

While [13–15, 20, 24, 25] are the only formal publications we have found that specify the incorrect behaviour, it seems that many informal descriptions of PIP also overlook the possibility that another high-priority process might wait for a low-priority process to finish. A notable exception is the textbook [3], which gives the correct behaviour of resetting the priority of a thread to the highest remaining priority of the threads it blocks. This textbook also gives an informal proof for the correctness of PIP in the style of Sha et al. Unfortunately, this informal proof is too vague to be useful for formalising the correctness of PIP, and the specification leaves out nearly all the details needed to implement PIP efficiently.

Contributions. There have been earlier formal investigations into PIP [8, 10, 29], but they employ model checking techniques. This paper presents a formalised and mechanically checked proof for the correctness of PIP. For this we needed to design a new correctness criterion for PIP. In contrast to model checking, our formalisation provides insight into why PIP is correct and allows us to prove stronger properties that, as we will show, can help with an efficient implementation of PIP. We illustrate this with an implementation of PIP in the educational operating system PINTOS [19]. For example, we found by “playing” with the formalisation that the choice of the next thread to take over a lock when a resource is released is irrelevant for PIP being correct, a fact that has not been mentioned in the literature and has not been used in the reference implementation of PIP in PINTOS. This fact, however, is important for an efficient implementation of PIP, because we can give the lock to the thread with the highest priority so that it terminates more quickly. We are also able to generalise the scheduler of Sha et al. [24] to the practically relevant case where critical sections can overlap; see Fig. 1a for an example of this restriction. In the existing literature there is no proof and also no proof method that covers this generalised case.

2 Formal Model of the Priority Inheritance Protocol

The Priority Inheritance Protocol, short PIP, is a scheduling algorithm for a single-processor system.² Following good experience in earlier work [27], our model of PIP is based on Paulson’s inductive approach for protocol verification [18]. In this approach a state of a system is given by a list of events that happened so far (with new events prepended to the list). Events of PIP fall into five categories defined as the Isabelle datatype:

whereby threads, priorities and (critical) resources are represented as natural numbers. In what follows we shall use cs as a name for critical resources. The event Set models the situation that a thread obtains a new priority given by the programmer or user (for example via the nice utility under UNIX). For states we define the following type-synonym:

As in Paulson’s work, we need to define functions that allow us to make some observations about states. One function, called threads, calculates the set of “live” threads that we have seen so far in a state:
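As a rough illustration of such an observation function, the following Python sketch computes the live threads of a state. It is not the Isabelle definition: events are encoded as tuples whose first component is a tag, and the tags "Create", "Exit" and "P" as well as the tuple layout are assumptions made for this example. As in the model, a state is a list of events with the newest event first.

```python
def threads(state):
    """Set of live threads in `state`, a list of events, newest first."""
    if not state:
        return set()
    event, rest = state[0], state[1:]
    if event[0] == "Create":          # ("Create", thread, priority)
        return {event[1]} | threads(rest)
    if event[0] == "Exit":            # ("Exit", thread)
        return threads(rest) - {event[1]}
    return threads(rest)              # every other event leaves the set unchanged


# Thread 1 and thread 2 are created, thread 2 requests a resource,
# then thread 1 exits (the newest event is at the front of the list).
state = [("Exit", 1), ("P", 2, 7), ("Create", 2, 5), ("Create", 1, 3)]
live = threads(state)
print(live)
```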

In this definition _::_ stands for list-cons and [] for the empty list. We use _ to match any pattern, like in functional programming. Another function calculates the priority for a thread th, which is defined as

In this definition we set 0 as the default priority for threads that have not (yet) been created. The last function we need calculates the “time”, or index, at which a thread had its priority last set.
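The two observation functions for priorities and time stamps can be sketched in Python as follows. This is only an illustration under stated assumptions: events are tuples tagged "Create" or "Set" that carry a thread and a priority, the state lists the newest event first, and 0 is used as the default value for threads that were never created. The function names are chosen for this sketch.

```python
def priority(th, state):
    """Most recently assigned priority of `th`, or 0 if it was never created."""
    for event in state:                       # newest event first
        if event[0] in ("Create", "Set") and event[1] == th:
            return event[2]
    return 0


def last_set(th, state):
    """Number of events that preceded the last priority assignment to `th`."""
    for i, event in enumerate(state):
        if event[0] in ("Create", "Set") and event[1] == th:
            return len(state) - i - 1         # length of the remaining tail
    return 0


# Thread 1 is created, thread 2 is created, then thread 1 gets a new priority.
state = [("Set", 1, 9), ("Create", 2, 5), ("Create", 1, 3)]
print(priority(1, state), last_set(1, state))
print(priority(2, state), last_set(2, state))
```

Because the time stamp is the length of the event list before the assignment, a thread whose priority was set earlier always has the smaller time stamp.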

In this definition |s| stands for the length of the list of events s. Again the default value in this function is 0 for threads that have not been created yet. An actor of an event is defined as

This allows us to filter out the actions a set of threads ths perform in a list of events s, namely

where we use Isabelle’s notation for list-comprehensions. This notation is very similar to the notation used in Haskell for list-comprehensions. A precedence of a thread th in a state s is the pair of natural numbers defined as

We also use the abbreviation

for the precedences of a set of threads ths in state s. The point of precedences is to schedule threads not according to priorities (what should we do if two threads have the same priority?), but according to precedences. Precedences allow us to always discriminate between two threads with equal priority by taking into account the time when the priority was last set. We order precedences so that threads with the same priority get a higher precedence if their priority has been set earlier, since for such threads it is more urgent to finish their work. In an implementation this choice would translate to a quite straightforward FIFO-scheduling of threads with the same priority.
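The ordering on precedences described above can be illustrated with a short Python sketch. It assumes that a precedence is a pair of a priority and the time stamp at which that priority was last set; the thread names and numbers are made up for the example.

```python
def precedence_key(prec):
    """Sort key: higher priority wins; on a tie, the earlier time stamp wins."""
    prio, time_set = prec
    return (prio, -time_set)


# Hypothetical precedences: thread name -> (priority, time the priority was set).
precs = {"A": (5, 0), "B": (5, 3), "C": (2, 1)}

# Highest precedence first; A beats B because its priority was set earlier.
run_order = sorted(precs, key=lambda th: precedence_key(precs[th]), reverse=True)
print(run_order)
```

Since no two assignments share a time stamp, two distinct threads never compare as equal, which is what makes the FIFO behaviour for equal priorities well defined.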

Moylan et al. [16] considered the alternative of “time-slicing” threads with equal priority, but found that it does not lead to advantages in practice. On the contrary, according to their work having a policy like our FIFO-scheduling of threads with equal priority reduces the number of tasks involved in the inheritance process and thus minimises the number of potentially expensive thread-switches.

Next, we introduce the concept of waiting queues. They are lists of threads associated with every resource. The first thread in this list (i.e. the head, or hd for short) is chosen to be the one that is in possession of the “lock” of the corresponding resource. We model waiting queues as functions, below abbreviated as wq. They take a resource as argument and return a list of threads. This allows us to define when a thread holds, respectively waits for, a resource cs given a waiting queue function wq:

In this definition we assume that set converts a list into a set. Note that in the first definition the condition about th ∈ set (wq cs) does not follow from th = hd (wq cs), since the head of an empty list is undefined in Isabelle/HOL. At the beginning, that is in the state where no thread is created yet, the waiting queue function will be the function that returns the empty list for every resource.
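A minimal Python sketch of these two predicates follows. The function names, the string identifiers for threads and resources, and the dictionary behind the waiting queue function are assumptions of this example. The membership test comes first, so an empty queue is never asked for its head, which mirrors the remark about the head of the empty list.

```python
def holds(wq, th, cs):
    """`th` holds `cs` if it is in the queue of `cs` and is its head."""
    queue = wq(cs)
    return th in queue and th == queue[0]


def waits(wq, th, cs):
    """`th` waits for `cs` if it is in the queue of `cs` but is not its head."""
    queue = wq(cs)
    return th in queue and th != queue[0]


def all_unlocked(cs):
    """Initial waiting queue function: every resource has an empty queue."""
    return []


# th0 holds cs1; th1 and th2 wait for it.
queues = {"cs1": ["th0", "th1", "th2"]}
wq = lambda cs: queues.get(cs, [])

print(holds(wq, "th0", "cs1"), waits(wq, "th1", "cs1"))
print(holds(all_unlocked, "th0", "cs1"))
```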

Using holds and waits, we can introduce Resource Allocation Graphs (RAG), which represent the dependencies between threads and resources. We choose to represent RAGs as relations using pairs of the form

(T th, C cs) and (C cs, T th)    (2)

where the first stands for a waiting edge and the second for a holding edge (T and C are constructors of a datatype for vertices). Given a waiting queue function, a RAG is defined as the union of the sets of waiting and holding edges, namely

If there is no cycle, then every RAG can be pictured as a forest of trees, as for example in Fig. 2.
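How such a graph arises from waiting queues can be sketched in Python. The sketch assumes that the queues are given as a dictionary from resources to lists of distinct threads, and it tags vertices with the strings "T" (thread) and "C" (resource); both the encoding and the function name are choices made for this example.

```python
def rag(queues):
    """Edges of the resource allocation graph for a dict resource -> queue."""
    edges = set()
    for cs, queue in queues.items():
        for i, th in enumerate(queue):
            if i == 0:
                edges.add((("C", cs), ("T", th)))   # holding edge: resource -> holder
            else:
                edges.add((("T", th), ("C", cs)))   # waiting edge: waiter -> resource
    return edges


# th0 holds cs1 and th1 waits for it; th1 itself holds cs2.
queues = {"cs1": ["th0", "th1"], "cs2": ["th1"]}
edges = rag(queues)
for edge in sorted(edges):
    print(edge)
```

Following the edges from th1 leads over cs1 to th0, which has no outgoing edge and is therefore the root of its tree.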

Because of the RAGs, we will need to formalise some results about graphs. For our purposes, the most convenient representation of graphs seems to be binary relations given by sets of pairs, as shown in (2). The pairs stand for the edges in graphs. This relation-based representation has the advantage that the notions

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbf_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbg_HTML.gif

are already defined in terms of relations amongst threads and resources. Also, we can easily re-use the standard notions for transitive closure operations _^* and _^+, as well as relation composition for our graphs. While there are a few formalisations for graphs already implemented in Isabelle, we choose to introduce our own library of graphs for PIP. The justification for this is that we wanted to have a more general theory of graphs which is capable of representing potentially infinite graphs (in the sense of infinitely branching and infinite size): the property that our RAGs are actually forests of finitely branching trees having only a finite depth should be something we can prove for our model of PIP; it should not be an assumption we build already into our model. A forest is defined in our representation as the relation rel that is single valued and acyclic:

The children, subtree and ancestors of a node in a graph can be easily defined relationally as

Note that forests can have trees with infinite depth and containing nodes with infinitely many children. A finite forest is a forest whose underlying relation is well-founded³ and every node has finitely many children (is only finitely branching).
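For finite relations these notions can be sketched directly in Python. This is an illustration only: the formal development also covers infinite relations, which a set of pairs in Python cannot represent, and the function names are chosen for the example. An edge (x, y) points from a node x to the node y it depends on.

```python
def single_valued(rel):
    """Every node has at most one outgoing edge."""
    sources = [x for (x, _) in rel]
    return len(sources) == len(set(sources))


def ancestors(rel, node):
    """Nodes reachable from `node` via one or more edges."""
    seen, frontier = set(), {node}
    while frontier:
        frontier = {y for (x, y) in rel if x in frontier} - seen
        seen |= frontier
    return seen


def acyclic(rel):
    """No node can reach itself."""
    return all(x not in ancestors(rel, x) for (x, _) in rel)


def is_forest(rel):
    return single_valued(rel) and acyclic(rel)


def children(rel, node):
    """Nodes with an edge pointing directly to `node`."""
    return {x for (x, y) in rel if y == node}


def subtree(rel, node):
    """`node` together with every node that can reach it."""
    return {node} | {x for (x, _) in rel if node in ancestors(rel, x)}


rel = {("a", "r"), ("b", "r"), ("c", "a")}
print(is_forest(rel), children(rel, "r"), subtree(rel, "r"), ancestors(rel, "c"))
```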

The locking mechanism ensures that for each thread node, there can be many incoming holding edges in the RAG, but at most one outgoing waiting edge. The reason is that when a thread asks for a resource that is locked already, then the thread is blocked and cannot ask for another resource. Clearly, every resource can also have at most one outgoing holding edge, indicating that the resource is locked. So if the RAG is well-founded and finite, we can always start at a thread waiting for a resource and “chase” outgoing arrows leading to a single root of a tree, which must be a ready thread.

The use of relations for representing RAGs allows us to conveniently define the Thread Dependants Graph (TDG):

This definition is the relation that one thread is waiting for another to release a resource, but the corresponding resource is “hidden”. In Fig. 2 this means the TDG connects

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbs_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbt_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbu_HTML.gif

, which both wait for resource

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbv_HTML.gif

to be released; and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbw_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figbx_HTML.gif

, which cannot make any progress unless

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figby_HTML.gif

makes progress. Similarly for the other threads. If there is a cycle of dependencies in a RAG (and thus TDG), then clearly we have a deadlock. Therefore when a thread requests a resource, we must ensure that the resulting RAG and TDG are not circular. In practice, the programmer has to ensure this. Our model will enforce that critical resources can only be requested provided no circularity can arise (but critical sections can overlap, see Fig. 1).
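The step from the resource graph to thread dependencies, and the deadlock check on the result, can be sketched in Python as follows. The vertex tags "T" and "C", the function names and the example graphs are assumptions of this sketch; a thread-to-thread edge is produced whenever a waiting edge into a resource meets a holding edge out of the same resource.

```python
def tdg(rag_edges):
    """Thread dependencies: th1 -> th2 if th1 waits for a resource held by th2."""
    result = set()
    for (kind1, th1), (kind2, cs) in rag_edges:
        if (kind1, kind2) != ("T", "C"):
            continue                                  # not a waiting edge
        for (kind3, cs2), (kind4, th2) in rag_edges:
            if (kind3, kind4) == ("C", "T") and cs2 == cs:
                result.add((th1, th2))                # the resource is hidden
    return result


def has_cycle(edges):
    """True if some node can reach itself by following edges."""
    for start in {x for (x, _) in edges}:
        seen, frontier = set(), {start}
        while frontier:
            frontier = {y for (x, y) in edges if x in frontier} - seen
            seen |= frontier
        if start in seen:
            return True
    return False


# th1 and th2 both wait for cs1, which th0 holds.
ok = {(("C", "cs1"), ("T", "th0")),
      (("T", "th1"), ("C", "cs1")),
      (("T", "th2"), ("C", "cs1"))}
# th0 holds cs1 and waits for cs2, while th1 holds cs2 and waits for cs1.
deadlock = {(("C", "cs1"), ("T", "th0")), (("T", "th0"), ("C", "cs2")),
            (("C", "cs2"), ("T", "th1")), (("T", "th1"), ("C", "cs1"))}

print(tdg(ok), has_cycle(tdg(ok)))
print(tdg(deadlock), has_cycle(tdg(deadlock)))
```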

Next we introduce the notion of the current precedence of a thread th in a state s. It is defined as

While the precedence

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figcg_HTML.gif

of any thread is determined statically (for example when the thread is created), the point of the current precedence is to dynamically boost this precedence, if needed according to PIP. Therefore the current precedence of th is given as the maximum of the precedences of all threads in its subtree (which includes by definition th itself). Since the notion of current precedence is defined via the transitive closure of the dependent threads in the TDG, we deal correctly with the problem in the informal algorithm by Sha et al. [24] where a priority of a thread is lowered prematurely (see Introduction). We again introduce an abbreviation for current precedences of a set of threads, written

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figck_HTML.gif

. The next function, called schs, defines the behaviour of the scheduler. It is defined by recursion on the state (a list of events) and returns a schedule state, which we represent as a record consisting of two functions:

The first function is a waiting queue function (that is, it takes a resource cs and returns the corresponding list of threads that lock or wait for it); the second is a function that takes a thread and returns its current precedence [see the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figcp_HTML.gif

in (5)]. We assume the usual getter and setter methods for such records.
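The idea of the current precedence, namely the maximum precedence over a thread's subtree of dependants, can be sketched in Python. The encoding is an assumption of this example: thread dependencies are pairs (waiter, blocker), precedences are pairs of a priority and a time stamp, and the names cprec and subtree are chosen for the sketch.

```python
def subtree(tdg, th):
    """`th` plus all threads that directly or indirectly wait for it."""
    result, frontier = {th}, {th}
    while frontier:
        frontier = {x for (x, y) in tdg if y in frontier} - result
        result |= frontier
    return result


def cprec(tdg, prec, th):
    """Highest precedence found in the subtree of `th`."""
    key = lambda p: (p[0], -p[1])   # higher priority first, then earlier time stamp
    return max((prec[t] for t in subtree(tdg, th)), key=key)


# L blocks both H and H2.
prec = {"L": (1, 0), "H": (9, 1), "H2": (8, 2)}
before = cprec({("H", "L"), ("H2", "L")}, prec, "L")
# After L releases the resource H waited for, only H2 still depends on L.
after = cprec({("H2", "L")}, prec, "L")
print(before, after)
```

Because the value is recomputed from the current dependencies, L falls back to the precedence of H2 rather than to its own, which is exactly the case the flawed specifications miss.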

In the initial state, the scheduler starts with all resources unlocked [the corresponding function is defined in (1)] and the current precedence of every thread is initialised with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figcq_HTML.gif

; that means

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figcr_HTML.gif

. Therefore we have for the initial schedule state

The cases for Create, Exit and Set are also straightforward: we calculate the waiting queue function of the (previous) state s; this waiting queue function wq is unchanged in the next schedule state, because none of these events locks or releases any resource; for calculating the next

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figcy_HTML.gif

, we use

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figcz_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figda_HTML.gif

defined above. This gives the following three clauses for schs:

More interesting are the cases where a resource, say cs, is requested or released. In these cases we need to calculate a new waiting queue function. For the event P th cs, we have to update the function so that the new thread list for cs is the old thread list plus the thread th appended to the end of that list (remember the head of this list is assigned to be in the possession of this resource). This gives the clause

The clause for event V th cs is similar, except that we need to update the waiting queue function so that the thread that possessed the lock is deleted from the corresponding thread list. For this list transformation, we use the auxiliary function release. A simple version of release would just delete this thread and return the remaining threads, namely the tail of the list.

In practice, however, often the thread with the highest precedence in the list will get the lock next. We have implemented this choice, but later found out that the choice of which thread is chosen next is actually irrelevant for the correctness of PIP. Therefore we prove the stronger result where

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdm_HTML.gif

is defined as

where

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdo_HTML.gif

stands for Hilbert’s epsilon and implements an arbitrary choice for the next waiting list. It just has to be a list of distinct threads and contain the same elements as

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdp_HTML.gif

(essentially

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdq_HTML.gif

can be any reordering of the list

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdr_HTML.gif

). This gives for

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figds_HTML.gif

the clause:

Having the scheduler function

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdu_HTML.gif

at our disposal, we can “lift”, or overload, the notions

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdv_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdw_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdx_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdy_HTML.gif

, and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figdz_HTML.gif

to operate on states only.
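The two queue updates described above (append on a request, drop the head and reorder arbitrarily on a release) can be illustrated with a small Python sketch. The tuple encoding of events, the event tags `"P"` and `"V"`, and the names `release_queue` and `waiting_queues` are assumptions made for this illustration and are not taken from the formalisation; a state is a list of events with the most recent event first.

```python
def release_queue(queue, choose=None):
    """Drop the lock holder (the head) and return the remaining waiters.

    `choose` is a hypothetical hook standing in for the arbitrary choice:
    it may return any reordering of the remaining waiters.
    """
    rest = list(queue[1:])
    if choose is None:
        return rest  # the "simple version": keep the remaining order
    reordered = list(choose(rest))
    if sorted(reordered) != sorted(rest) or len(set(reordered)) != len(reordered):
        raise ValueError("choice must be a reordering of the distinct waiters")
    return reordered


def waiting_queues(state, choose=None):
    """Replay the events oldest-first and build a map resource -> thread list."""
    wq = {}
    for event in reversed(state):  # state[0] is the most recent event
        kind = event[0]
        if kind == "P":  # request: append the thread to the end of the list
            _, th, cs = event
            wq[cs] = wq.get(cs, []) + [th]
        elif kind == "V":  # release: remove the holder, pick the next list
            _, th, cs = event
            wq[cs] = release_queue(wq.get(cs, []), choose)
        # all other events leave the waiting queues unchanged
    return wq
```

Passing a different `choose` function changes only the order of the remaining waiters, which mirrors the observation that the choice of the next lock holder does not matter for the correctness argument.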

With these abbreviations in place we can derive the following two facts about

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figeb_HTML.gif

s and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figec_HTML.gif

, which are more convenient to use in subsequent proofs.

Next we can introduce the notion of a thread being

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figee_HTML.gif

in a state (i.e. threads that do not wait for any resource, which are the roots of the trees in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figef_HTML.gif

, see Fig. 2). The

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figeg_HTML.gif

thread is then the thread with the highest current precedence of all ready threads.
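As an informal illustration of these two notions, the following Python sketch computes the ready and running sets from a waiting-queue map. It assumes that plain numbers stand in for current precedences and that the head of each queue is the lock holder; the function and parameter names are hypothetical.

```python
def ready(alive, wq):
    """Alive threads that do not wait for any resource.

    The head of each queue holds the lock, so only the tail is waiting.
    """
    waiting = {th for queue in wq.values() for th in queue[1:]}
    return {th for th in alive if th not in waiting}


def running(alive, wq, cprec):
    """Ready threads whose current precedence is the maximum over all ready threads."""
    rdy = ready(alive, wq)
    if not rdy:
        return set()  # no ready thread, hence no running thread
    top = max({cprec[th] for th in rdy})  # maximum of the image of `rdy` under cprec
    return {th for th in rdy if cprec[th] == top}
```

Returning a set covers both situations: the empty set when no thread is ready, and otherwise the thread or threads attaining the maximum.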

In the second definition

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figei_HTML.gif

stands for the image of a set under a function. Note that in the initial state, that is where the list of events is empty, the set

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figej_HTML.gif

is empty and therefore there is neither a ready nor a running thread. If one or more threads are ready, then only one thread can be running, namely the one whose current precedence equals the maximum over all ready threads. We use sets to capture both possibilities.

We can now also conveniently define the set of resources that are locked by a thread in a given state, and when a thread is detached in a state (meaning the thread neither holds nor waits for a resource; in the RAG this corresponds to an isolated node without any incoming or outgoing edges, see Fig. 2):

Finally we can define what a valid state is in our model of PIP. For example, we cannot expect to be able to exit a thread if it has not been created yet. These validity constraints on states are characterised by the inductive predicates

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figel_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figem_HTML.gif

. We first give five inference rules for

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figen_HTML.gif

relating a state and an event that can happen next.

The first rule states that a thread can only be created if it is not alive yet. Similarly, the second rule states that a thread can only be terminated if it was running and no longer locks any resources (this slightly simplifies our model; in practice we would expect the operating system to release all locks held by a thread that is about to exit). The event

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figep_HTML.gif

can happen if the corresponding thread is running.

This is because the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figer_HTML.gif

event is for a thread to change its own priority—therefore it must be running.

If a thread wants to lock a resource, then the thread needs to be running and also we have to make sure that the resource lock does not lead to a cycle in the RAG (the purpose of the second premise in the rule below). In practice, ensuring the latter is the responsibility of the programmer. In our formal model we brush aside these problematic cases in order to be able to make some meaningful statements about PIP.⁴

Similarly, if a thread wants to release a lock on a resource, then it must be running and in the possession of that lock. This is formally given by the last inference rule of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figet_HTML.gif

Note, however, that apart from the circularity condition, we do not make any assumption on how different resources can be locked and released relative to each other. In our model it is possible that critical sections overlap. This is in contrast to Sha et al. [24] who require that critical sections are properly nested (recall Fig. 1).
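The side conditions of the five rules described above can be paraphrased as a Python predicate. This is only a sketch under stated assumptions: events are tuples tagged `"Create"`, `"Exit"`, `"Set"`, `"P"` and `"V"`, the caller supplies the alive set, the running set and the waiting queues of the current state, and all helper names are hypothetical.

```python
def holder(wq, cs):
    """The thread at the head of the queue for `cs`, if any."""
    queue = wq.get(cs, [])
    return queue[0] if queue else None


def would_cycle(wq, th, cs):
    """Would a request of `cs` by `th` close a cycle in the RAG?"""
    seen = set()
    while cs is not None and cs not in seen:
        seen.add(cs)
        owner = holder(wq, cs)
        if owner is None:
            return False
        if owner == th:
            return True
        # follow the resource (if any) that the owner is itself waiting for
        cs = next((r for r, q in wq.items() if owner in q[1:]), None)
    return False


def step_allowed(alive, running, wq, event):
    """Check whether `event` may happen next in the described state."""
    kind, th = event[0], event[1]
    if kind == "Create":
        return th not in alive
    if th not in running:  # every other event requires a running thread
        return False
    if kind == "Exit":
        return all(holder(wq, cs) != th for cs in wq)
    if kind == "Set":
        return True
    if kind == "P":
        return not would_cycle(wq, th, event[2])
    if kind == "V":
        return holder(wq, event[2]) == th
    return False
```

Note that nothing in this check relates different resources to each other apart from the cycle test, so overlapping critical sections are accepted.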

A valid state of PIP can then conveniently be defined as follows:

This completes our formal model of PIP. In the next section we present a series of desirable properties derived from this model of PIP. This can be regarded as a validation of the correctness of our model.
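The inductive flavour of validity (the empty state is valid, and a valid state extended by an admissible event is valid again) can be sketched in Python as follows. The `step` argument is a placeholder for whatever admissibility check is used; `toy_step` and `alive_threads` are hypothetical stand-ins that only cover thread creation and termination, not the full set of rules.

```python
def valid(state, step):
    """A state (newest event first) is valid if every event was admissible
    in the state consisting of all earlier events."""
    return all(step(state[i + 1:], state[i]) for i in range(len(state)))


def alive_threads(state):
    """Threads created and not yet exited, replaying the events oldest-first."""
    alive = set()
    for kind, th, *_ in reversed(state):
        if kind == "Create":
            alive.add(th)
        elif kind == "Exit":
            alive.discard(th)
    return alive


def toy_step(state, event):
    """Toy admissibility check: create only fresh threads, act only when alive."""
    kind, th = event[0], event[1]
    if kind == "Create":
        return th not in alive_threads(state)
    return th in alive_threads(state)
```

For instance, a state whose only event exits a thread is rejected, which corresponds to the remark that a thread cannot exit before it has been created.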

3 The Correctness Proof

Sha et al. state their first correctness criterion for PIP in terms of the number of low-priority threads [24, Theorem 3]: if there are

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figew_HTML.gif

low-priority threads, then a blocked job with high priority can only be blocked a maximum of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figex_HTML.gif

times. Their second correctness criterion is given in terms of the number of critical resources [24, Theorem 6]: if there are

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figey_HTML.gif

critical resources, then a blocked job with high priority can only be blocked a maximum of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figez_HTML.gif

times. Both results on their own, strictly speaking, do not prevent indefinite, or unbounded, Priority Inversion, because if a low-priority thread does not give up its critical resource (the one the high-priority thread is waiting for), then the high-priority thread can never run. The argument of Sha et al. is that if threads release locked resources in a finite amount of time, then indefinite Priority Inversion cannot occur; the high-priority thread is guaranteed to run eventually. The assumption is that programmers must ensure that threads are programmed in this way. However, even taking this assumption into account, the correctness properties of Sha et al. are not true for their version of PIP, despite being “proved”. As Yodaiken [30] and Moylan et al. [16] pointed out: if a low-priority thread possesses locks to two resources for which two high-priority threads are waiting, then lowering the priority prematurely after giving up only one lock can cause indefinite Priority Inversion for one of the high-priority threads, invalidating their two bounds (recall the counterexample described in the Introduction).
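The flaw can be made concrete with a small Python sketch. It assumes numeric priorities where larger means more urgent, considers only threads waiting directly for a held resource (no transitive chains), and uses hypothetical thread and resource names; it is an illustration of the scenario, not the algorithm of either paper.

```python
def inherited_priority(own, held, waiters, prio):
    """Maximum of the thread's own priority and the priorities of all threads
    waiting for a resource it still holds."""
    blocked = [prio[w] for cs in held for w in waiters.get(cs, [])]
    return max([own] + blocked)


def release_naive(own, held, cs, waiters, prio):
    """Premature lowering: fall back to the original priority after any release."""
    return own


def release_recompute(own, held, cs, waiters, prio):
    """Recompute the priority from the resources that are still held."""
    remaining = [r for r in held if r != cs]
    return inherited_priority(own, remaining, waiters, prio)


# A low-priority thread (priority 1) holds r1 and r2; H1 and H2 wait for them.
prio = {"H1": 10, "H2": 9}
waiters = {"r1": ["H1"], "r2": ["H2"]}
held = ["r1", "r2"]
medium = 5  # priority of an unrelated medium-priority thread

before = inherited_priority(1, held, waiters, prio)
naive = release_naive(1, held, "r1", waiters, prio)
fixed = release_recompute(1, held, "r1", waiters, prio)
print(before, naive, fixed)
```

With the naive reset the lock holder drops below the medium-priority thread while H2 is still waiting for r2, so H2 can be delayed indefinitely; recomputing from the remaining lock keeps the holder above the medium-priority thread.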

Even when fixed, their proof idea does not seem to go through for us, because of the way we have set up our formal model of PIP. One reason is that we allow critical sections, which start with a

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfa_HTML.gif

-event and finish with a corresponding

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfb_HTML.gif

-event, to arbitrarily overlap (something Sha et al. explicitly exclude). Therefore we have designed a different correctness criterion for PIP. The idea behind our criterion is as follows: for all states

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfc_HTML.gif

, we know the corresponding thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfd_HTML.gif

with the highest precedence; we show that in every future state (denoted by

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfe_HTML.gif

) in which

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figff_HTML.gif

is still alive, either

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfg_HTML.gif

is running or it is blocked by a thread that was alive in the state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfh_HTML.gif

and was waiting for or in the possession of a lock in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfi_HTML.gif

. Since in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfj_HTML.gif

, as in every state, the set of alive threads is finite,

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfk_HTML.gif

can only be blocked by a finite number of threads.

However, the theorem we are going to prove hinges upon a number of natural assumptions about the states

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfl_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfm_HTML.gif

, the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfn_HTML.gif

and the events happening in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figfo_HTML.gif

. We list them next:

Assumptions on the statesand We need to require that and are valid states:


Assumptions on the thread The thread must be alive in and has the highest precedence of all alive threads in . Furthermore the priority of is (we need this in the next assumptions).


Assumptions on the events in To make sure has the highest precedence we have to assume that events in can only create (respectively set) threads with equal or lower priority than of . For the same reason, we also need to assume that the priority of does not get reset and all other reset priorities are either less or equal. Moreover, we assume that does not get “exited” in . This can be ensured by assuming the following three implications.
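Read operationally, the three conditions on the events amount to a simple check over an event list. The Python sketch below assumes tuple-encoded events tagged `"Create"`, `"Set"` and `"Exit"` and numeric priorities; the function name and its parameters `th` and `prio` are chosen for this illustration only.

```python
def events_respect_assumptions(events, th, prio):
    """Check the three conditions for the distinguished thread `th`,
    whose priority is `prio`."""
    for event in events:
        kind, who = event[0], event[1]
        if kind == "Create" and event[2] > prio:
            return False  # a newly created thread must not have a higher priority
        if kind == "Set" and (who == th or event[2] > prio):
            return False  # th is never reset; other resets stay at or below prio
        if kind == "Exit" and who == th:
            return False  # th must not exit
    return True
```

Events that request or release resources are unconstrained by these conditions.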


The locale mechanism of Isabelle helps us to manage conveniently such assumptions [9]. Under these assumptions we shall prove the following correctness property:

Theorem 1

Given the assumptions about states

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgk_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgl_HTML.gif

, the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgm_HTML.gif

and the events in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgn_HTML.gif

, then either

or
there exists a thread with and such that , and .

This theorem ensures that the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgv_HTML.gif

, which has the highest precedence in the state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgw_HTML.gif

, is either running in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgx_HTML.gif

, or can only be blocked in the state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgy_HTML.gif

by a thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figgz_HTML.gif

that already existed in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figha_HTML.gif

and is waiting for a resource or had a lock on at least one resource—that means the thread was not detached in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighb_HTML.gif

. As we shall see shortly, that means there are only finitely many threads that can block

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighc_HTML.gif

in this way.

The next lemma is part of the proof for Theorem 1: Given our assumptions (on

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighd_HTML.gif

), the first property we show is that a running thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighe_HTML.gif

must either wait for or hold a resource in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighf_HTML.gif

Lemma 1

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighg_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighh_HTML.gif

then

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighi_HTML.gif

Proof

Let us assume otherwise, that is

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighj_HTML.gif

is detached in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighk_HTML.gif

, then, according to the definition of detached,

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighl_HTML.gif

does not hold or wait for any resource. Hence the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighm_HTML.gif

-value of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighn_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figho_HTML.gif

is not boosted, that is

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighp_HTML.gif

, and is therefore lower than the precedence (as well as the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighq_HTML.gif

-value) of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighr_HTML.gif

. This means

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighs_HTML.gif

will not run as long as

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fight_HTML.gif

is a live thread. In turn this means

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighu_HTML.gif

cannot take any action in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighv_HTML.gif

to change its current status; therefore

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighw_HTML.gif

is still detached in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighx_HTML.gif

. Consequently

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighy_HTML.gif

is also not boosted in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fighz_HTML.gif

and would not run. This contradicts our assumption. \(\square \)

Proof (of Theorem 1)

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figia_HTML.gif

, then there is nothing to show. So let us assume otherwise. Since the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figib_HTML.gif

is well-founded, we know there exists an ancestor of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figic_HTML.gif

that is the root of the corresponding subtree and therefore is ready (it does not request any resources). Let us call this thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figid_HTML.gif

. Since in PIP the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figie_HTML.gif

-value of any thread equals the maximum precedence of all threads in its

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figif_HTML.gif

-subtree, and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figig_HTML.gif

is in the subtree of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figih_HTML.gif

, the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figii_HTML.gif

-value of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figij_HTML.gif

cannot be lower than the precedence of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figik_HTML.gif

. But, it can also not be higher, because the precedence of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figil_HTML.gif

is the maximum among all threads. Therefore we know that the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figim_HTML.gif

-value of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figin_HTML.gif

is the same as the precedence of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figio_HTML.gif

. The result is that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figip_HTML.gif

must be running. This is because the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figiq_HTML.gif

-value of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figir_HTML.gif

is the highest of all ready threads. This follows from the fact that the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figis_HTML.gif

-value of any ready thread is the maximum of the precedences of all threads in its subtrees (with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figit_HTML.gif

having the highest of all threads and being in the subtree of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figiu_HTML.gif

). We also have that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figiv_HTML.gif

since we assumed

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figiw_HTML.gif

is not running. By Lemma 1 we have that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figix_HTML.gif

. If

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figiy_HTML.gif

is not detached in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figiz_HTML.gif

, that is either holding or waiting for a resource, it must be that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figja_HTML.gif

This concludes the Proof of Theorem 1. \(\square \)

4 A Finite Bound on Priority Inversion

As in the work by Sha et al., our result in Theorem 1 does not yet guarantee the absence of indefinite Priority Inversion. For this we further need the property that every thread gives up its resources after a finite amount of time. We found that this property is not so straightforward to formalise in our model. There are mainly two reasons for this: First, we do not specify what “running the code” of a thread means, for example by giving an operational semantics for machine instructions. Therefore we cannot characterise “good” programs, that is, programs that contain a corresponding unlocking request for every locking request for a resource. Second, we need to distinguish between a thread that “just” locks a resource for a finite amount of time (even if it is very long) and one that locks it forever (there might be an unbounded loop in between the locking and unlocking requests).

Because of these problems, we decided in our earlier paper [31] to leave out this property and let the programmer take on the responsibility to program threads in such a benign manner (in addition to causing no circularity in the RAG). This leave-it-to-the-programmer approach was also taken by Sha et al. in their paper. However, in this paper we can make an improvement by establishing a finite bound on the duration of Priority Inversion measured by the number of events. The events can be seen as a rough(!) abstraction of the “runtime behaviour” of threads and also as an abstract notion of “time”—when a new event happens, some time must have passed.

What we will establish in this section is that there can only be a finite number of states after state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjb_HTML.gif

in which the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjc_HTML.gif

is blocked (recall for this that a state is a list of events). For this finiteness bound to exist, Sha et al. informally make two assumptions: first, there is a finite pool of threads (active or hibernating) and second, each of these threads will give up its resources after a finite amount of time. However, we do not have this concept of active or hibernating threads in our model. In fact we can dispense with the first assumption altogether and allow that in our model we can create new threads or exit existing threads arbitrarily. Consequently, the absence of indefinite priority inversion we are trying to establish in our model is not true, unless we stipulate an upper bound on the number of threads that have been created during the time leading to any future state after

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjd_HTML.gif

. Otherwise our PIP scheduler could be “swamped” with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figje_HTML.gif

-requests of lower priority threads. So our first assumption states:

Assumption on the number of threads created after the state : Given the state , in every “future” valid state , we require that the number of created threads is less than a bound , that is
whereby is a list of events.

Note that it is not enough to just state that only a finite number of threads are created up to a single state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjk_HTML.gif

after

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjl_HTML.gif

. Instead, we need to put this bound on the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjm_HTML.gif

events for all valid states after

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjn_HTML.gif

. This ensures that no matter which “future” state is reached, the number of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjo_HTML.gif

-events is finite. This bound

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjp_HTML.gif

is assumed with respect to all future states

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjq_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjr_HTML.gif

, not just a single one.
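The difference between bounding a single future state and bounding all of them can be sketched in Python. The tuple encoding with a `"Create"` tag and the names `creations_within_bound`, `bound_holds_for_all` and `bc` are assumptions of this sketch; in the formal statement the quantification is over all valid future states, which a finite list can only approximate.

```python
def creations_within_bound(es, bc):
    """The number of creation events in `es` is strictly below the bound `bc`."""
    return sum(1 for e in es if e[0] == "Create") < bc


def bound_holds_for_all(futures, bc):
    """The same bound must hold for every considered future event list."""
    return all(creations_within_bound(es, bc) for es in futures)
```

A bound that holds for one event list but fails for a longer continuation of it does not satisfy the assumption.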

For our second assumption about giving up resources after a finite amount of “time”, let us introduce the following definition about threads that can potentially block

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjs_HTML.gif

This set contains all threads that are not detached in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figju_HTML.gif

. According to our definition of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjv_HTML.gif

, this means a thread in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjw_HTML.gif

either holds or waits for some resource in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjx_HTML.gif

. Our Theorem 1 implies that only these threads can potentially block

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjy_HTML.gif

after state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figjz_HTML.gif

. We need to make the following assumption about the threads in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figka_HTML.gif

-set:

Assumptions on the threads : For each such there exists a finite bound such that for all future valid states , we have that if , then

By this assumption we enforce that any thread potentially blocking

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkg_HTML.gif

must become detached (that is, it no longer owns any resource) after a finite number of events in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkh_HTML.gif

. Again we have to state this bound to hold in all valid states after

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figki_HTML.gif

. The bound reflects how each thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkj_HTML.gif

is programmed: Though we cannot express what instructions a thread is executing, the events in our model correspond to the system calls made by a thread. Our

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkk_HTML.gif

bounds the number of these “calls”.
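A hedged Python reading of this second assumption: collect the threads that are not detached, and require that a thread which is still not detached in a future state has performed fewer events than its bound. Treating the second component of every event tuple as the acting thread, as well as all function names and the parameter `bnd`, are assumptions made for this sketch.

```python
def detached(th, wq):
    """A thread is detached if it neither holds nor waits for any resource."""
    return all(th not in queue for queue in wq.values())


def blockers(wq):
    """All threads that are not detached, i.e. that occur in some queue."""
    return {th for queue in wq.values() for th in queue}


def actions_of(th, es):
    """Events in `es` performed by `th` (assumed to be the second tuple entry)."""
    return [e for e in es if e[1] == th]


def within_action_bound(th, es, wq_future, bnd):
    """If `th` is still not detached after `es`, it has used fewer than `bnd` events."""
    return detached(th, wq_future) or len(actions_of(th, es)) < bnd
```

The check is vacuously true once the thread has become detached, which matches the conditional form of the assumption.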

The main reason for these two assumptions is that we can prove the following: The number of states after

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkl_HTML.gif

in which the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkm_HTML.gif

is not running (that is where Priority Inversion occurs) can be bounded by the number of actions the threads in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkn_HTML.gif

perform (i.e. events) and how many threads are newly created. To state our bound formally, we need to make a definition of what we mean by intermediate states between a state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figko_HTML.gif

and a future state after

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkp_HTML.gif

; they will be the list of states starting from

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkq_HTML.gif

up to the state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkr_HTML.gif

. For example, suppose \(\textit{es} = [\textit{e}_n, \textit{e}_{n-1}, \ldots , \textit{e}_2, \textit{e}_1]\), then the intermediate states from

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figks_HTML.gif

upto

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkt_HTML.gif

are

This list of intermediate states can be defined by the following recursive function
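The displayed definition is not reproduced in this text. As a hedged sketch only, and assuming (as in the example above) that a state is a list of events with the most recent event at the head, a recursive function of this kind could look as follows. The infix name \(\textit{upto}\) and the exact clauses are our reconstruction, not a quotation of the formalisation.

```latex
\begin{align*}
s \;\textit{upto}\; [\,] &\;=\; [\,] \\
s \;\textit{upto}\; (e :: \textit{es}) &\;=\; (\textit{es} \mathbin{@} s) :: (s \;\textit{upto}\; \textit{es})
\end{align*}
```

Under this reading, \(\textit{es} = [\textit{e}_n, \ldots , \textit{e}_1]\) unfolds to the \(n\) states \([\textit{e}_{n-1}, \ldots , \textit{e}_1] \mathbin{@} s\), \(\ldots \), \(\textit{e}_1 :: s\), \(s\), so the list of intermediate states has the same length as \(\textit{es}\).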

Our theorem can then be stated as follows:

Theorem 2

Given our assumptions about bounds, we have that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ5_HTML.png

This theorem uses Isabelle’s list-comprehension notation, which lists all intermediate states between

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkw_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkx_HTML.gif

, and then filters this list according to states in which

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figky_HTML.gif

is not running. By calculating the number of elements in the filtered list using the function

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figkz_HTML.gif

, we have the number of intermediate states in which

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figla_HTML.gif

is not running and which by the theorem is bounded by the term on the right-hand side.

Proof

There are two characterisations for the number of events in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlb_HTML.gif

: First, in each state in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlc_HTML.gif

, clearly either

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figld_HTML.gif

is running or not running. Together with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figle_HTML.gif

, that implies

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ1_HTML.png

(7)

The actions in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlf_HTML.gif

can be partitioned into the actions of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlg_HTML.gif

and the actions of threads other than

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlh_HTML.gif

. The latter can further be divided into actions of existing threads and the actions to create new ones. Moreover, the actions of existing threads other than

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figli_HTML.gif

are, by Theorem 1, the actions of blockers. This gives rise to

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ2_HTML.png

(8)

Furthermore we know that an action of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlj_HTML.gif

in the intermediate states

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlk_HTML.gif

can only be taken when

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figll_HTML.gif

is running. Therefore

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ6_HTML.png

holds. Substituting this into (7) gives

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ7_HTML.png

into which we can substitute (8) yielding

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ8_HTML.png

By our first assumption we know that the number of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlm_HTML.gif

-events is bounded by the bound

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figln_HTML.gif

. By our second assumption we can prove that the number of actions of all blockers is bounded by the sum of the bounds of the individual blocking threads, that is

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Equ9_HTML.png

With this in place we can conclude our theorem. \(\square \)
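The counting argument in the proof can be summarised in a few lines. The abbreviations below are ours and only serve this sketch: \(N\) is the number of events in the event list (and hence of intermediate states), \(R\) and \(\overline{R}\) are the numbers of intermediate states in which the thread is running and not running, and \(A_{\textit{th}}\), \(A_{\textit{create}}\) and \(A_{\textit{block}}\) are the numbers of actions of the thread itself, of thread creations, and of the blockers.

```latex
\begin{align*}
N &= R + \overline{R} && \text{cf. (7)} \\
N &= A_{\textit{th}} + A_{\textit{create}} + A_{\textit{block}} && \text{cf. (8)} \\
A_{\textit{th}} &\le R && \text{the thread acts only when running} \\
\overline{R} &= N - R \;\le\; N - A_{\textit{th}} \;=\; A_{\textit{create}} + A_{\textit{block}}
\end{align*}
```

The two assumptions then bound \(A_{\textit{create}}\) and \(A_{\textit{block}}\) respectively, which gives the right-hand side of the theorem.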

This theorem is the main conclusion we obtain for the Priority Inheritance Protocol. It is based on the fact that the set of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlo_HTML.gif

is fixed at state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlp_HTML.gif

when

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlq_HTML.gif

becomes the thread with the highest priority. Then no additional blocker of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlr_HTML.gif

can appear after the state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figls_HTML.gif

. In this way we can bound the number of states where the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlt_HTML.gif

with the highest priority is prevented from running. Our bound does not depend on the restriction of well-nested critical sections in the Priority Inheritance Protocol as imposed by Sha et al.

5 Properties for an Implementation

While our formalised proof gives us confidence about the correctness of our model of PIP, we found that the formalisation can even help us with efficiently implementing it. For example, Baker complained that calculating the current precedence in PIP is quite “heavy weight” in Linux (see the Introduction). In our model of PIP the current precedence of a thread in a state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlu_HTML.gif

depends on the precedences of all threads in its subtree—a “global” transitive notion, which is indeed heavy weight [see the equation for

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlv_HTML.gif

shown in (6)]. We can however improve upon this. For this recall the notion of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlw_HTML.gif

of a thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlx_HTML.gif

defined in (3). There a child is a thread that is only one “hop” away from the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figly_HTML.gif

in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figlz_HTML.gif

(and waiting for

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figma_HTML.gif

to release a resource). Using children, we can prove the following lemma for more efficiently calculating

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmb_HTML.gif

of a thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmc_HTML.gif

Lemma 2

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmd_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figme_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmf_HTML.gif

That means the current precedence of a thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmh_HTML.gif

can be computed by considering the static precedence of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmi_HTML.gif

and the current precedences of the children of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmj_HTML.gif

. Their

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmk_HTML.gif

s, in general, need to be computed by recursively descending into deeper “levels” of the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figml_HTML.gif

. However, the current precedence of a thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmm_HTML.gif

, say, only needs to be recomputed when

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmn_HTML.gif

its static precedence is re-set or when

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmo_HTML.gif

one of its children changes its current precedence or when

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmp_HTML.gif

the children set changes (for example in a

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmq_HTML.gif

-event). If only the static precedence or the children-set changes, then we can avoid the recursion and compute the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmr_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figms_HTML.gif

locally. In such cases the recursion does not need to descend into the corresponding subtree. Once the current precedence is computed in this more efficient manner, the selection of the thread with highest precedence from a set of ready threads is a standard scheduling operation and implemented in most operating systems.
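The difference between the local computation suggested by Lemma 2 and the full recursive one can be illustrated with a small C sketch. It is not the PINTOS code: the structure `thread`, its fields and both function names are hypothetical, and precedences are simplified to plain integers (larger means higher), whereas the model uses pairs.

```c
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical thread node; all names are illustrative only. */
struct thread {
    int prec;                               /* static precedence (simplified to an int) */
    int cprec;                              /* cached current precedence */
    struct thread *children[MAX_CHILDREN];  /* threads one "hop" away, waiting on this one */
    size_t n_children;
};

/* Local step in the spirit of Lemma 2: the maximum of the thread's own
 * static precedence and the cached current precedences of its children. */
int cprec_local(const struct thread *th)
{
    int best = th->prec;
    for (size_t i = 0; i < th->n_children; i++)
        if (th->children[i]->cprec > best)
            best = th->children[i]->cprec;
    return best;
}

/* Full recomputation: refresh the whole subtree first, then apply the
 * local step. Only needed when the children's values may be stale. */
int cprec_recursive(struct thread *th)
{
    for (size_t i = 0; i < th->n_children; i++)
        cprec_recursive(th->children[i]);
    th->cprec = cprec_local(th);
    return th->cprec;
}
```

If only the static precedence or the children set of a thread changes, calling `cprec_local` on that thread suffices in this sketch, because the cached values of its children are still valid.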

Below we outline how our formalisation guides the efficient calculation of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmt_HTML.gif

in response to each kind of event.

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmu_HTML.gif

We assume that the current state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmv_HTML.gif

and the next state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmw_HTML.gif

, whereby

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmx_HTML.gif

, are both valid (meaning the event

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmy_HTML.gif

is allowed to occur in

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figmz_HTML.gif

). In this situation we can show that

This means in an implementation we do not have to recalculate the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignb_HTML.gif

and also none of the current precedences of the other threads. The current precedence of the created thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignc_HTML.gif

is just its precedence, namely the pair

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignd_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figne_HTML.gif

We again assume that the current state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignf_HTML.gif

and the next state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figng_HTML.gif

, whereby this time

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignh_HTML.gif

, are both valid. We can show that

This means again we do not have to recalculate the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignj_HTML.gif

and also not the current precedences for the other threads. Since

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignk_HTML.gif

is not alive anymore in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignl_HTML.gif

, there is no need to calculate its current precedence.

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignm_HTML.gif

We assume that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignn_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figno_HTML.gif

with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignp_HTML.gif

are both valid. We can show that

The first property is again telling us we do not need to change the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignr_HTML.gif

. The second shows that the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figns_HTML.gif

-values of all threads other than

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignt_HTML.gif

are unchanged. The reason for this is more subtle: Since

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignu_HTML.gif

must be running, it does not wait for any resource to be released and therefore cannot be in the subtree of any other thread. So all current precedences of other threads are unchanged.

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignv_HTML.gif

We assume that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignw_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignx_HTML.gif

with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figny_HTML.gif

being

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Fignz_HTML.gif

are both valid. We have to consider two subcases: one where there is a thread to “take over” the released resource

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoa_HTML.gif

, and one where there is not. Let us consider them in turn. Suppose in state

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figob_HTML.gif

, the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoc_HTML.gif

takes over resource

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figod_HTML.gif

from thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoe_HTML.gif

. We can prove

which shows how the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figog_HTML.gif

needs to be changed. The next lemmas suggest how the current precedences need to be recalculated. For threads that are not

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoh_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoi_HTML.gif

nothing needs to be changed, since we can show

For

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figok_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figol_HTML.gif

we need to use Lemma 2 to recalculate their current precedence since their children have changed. However, neither

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figom_HTML.gif

nor

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figon_HTML.gif

is an element of the respective children, which is shown by the following two facts:

This means the recalculation of the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figop_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoq_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figor_HTML.gif

can be done independently and also done locally by only looking at the children: according to (9) and (10) none of the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figos_HTML.gif

of the children changes; only the children-sets change through a

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figot_HTML.gif

-event.

In the other case where there is no thread that takes over

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figou_HTML.gif

, we can prove that the updated

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figov_HTML.gif

merely deletes the relevant edge and that no current precedence needs to be recalculated for any thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figow_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoy_HTML.gif

We assume that

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figoz_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpa_HTML.gif

with

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpb_HTML.gif

are both valid. We again have to analyse two subcases, namely the one where

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpc_HTML.gif

is not locked, and one where it is. We treat the former case first by showing that

This means we need to add a holding edge to the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpe_HTML.gif

. However, note that while the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpf_HTML.gif

changes the corresponding

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpg_HTML.gif

does not change. Together with the fact that the precedences of all threads are unchanged, no

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figph_HTML.gif

value is changed. Therefore, no recalculation of the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpi_HTML.gif

value of any thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpj_HTML.gif

is needed.

In the second case we know that resource

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpk_HTML.gif

is locked. We can show that

That means we have to add a waiting edge to the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpm_HTML.gif

. Furthermore the current precedence for all threads that are not ancestors of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpn_HTML.gif

(in the new

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpo_HTML.gif

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpp_HTML.gif

) are unchanged. For the ancestors of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpq_HTML.gif

we need to follow the edges in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpr_HTML.gif

and recompute the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figps_HTML.gif

. Whereas in all other events we might have to make modifications to the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpt_HTML.gif

, no recalculation of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpu_HTML.gif

depends on the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpv_HTML.gif

. This is the only case where the recalculation needs to take the connections in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpw_HTML.gif

into account. To do this we can start from

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpx_HTML.gif

and follow the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpy_HTML.gif

-edges to recompute the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figpz_HTML.gif

of every thread encountered on the way using Lemma 2. This means the recomputation can be done locally (level-by-level) in a bottom-up fashion. Since the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqa_HTML.gif

, and thus

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqb_HTML.gif

, are loop free, this procedure will always stop. The following lemma shows, however, that this procedure can often stop earlier, without having to consider all ancestors.

This property states that if an intermediate

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqd_HTML.gif

-value does not change (in this case the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqe_HTML.gif

-value of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqf_HTML.gif

), then the procedure can also stop, because none of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqg_HTML.gif

ancestor-threads will have their current precedence changed.
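The bottom-up recomputation with early termination can be sketched in C as follows. This is an illustration under simplifying assumptions, not the code of our implementation: the structure `thread`, the field `blocked_by` and the function `block_on` are hypothetical, precedences are plain integers, and the resource node between a waiting thread and the holder is collapsed into a direct pointer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical thread node; all names are illustrative only. */
struct thread {
    int prec;                               /* static precedence (simplified to an int) */
    int cprec;                              /* cached current precedence */
    struct thread *blocked_by;              /* holder of the awaited lock, or NULL */
    struct thread *children[MAX_CHILDREN];  /* threads waiting directly on this one */
    size_t n_children;
};

/* Local step: own static precedence versus cached values of the children. */
static int cprec_local(const struct thread *th)
{
    int best = th->prec;
    for (size_t i = 0; i < th->n_children; i++)
        if (th->children[i]->cprec > best)
            best = th->children[i]->cprec;
    return best;
}

/* Record that waiter now waits on holder, then walk up the chain of
 * ancestors, recomputing each cached value locally. The walk stops as
 * soon as a value is unchanged. Returns the number of ancestors visited. */
int block_on(struct thread *waiter, struct thread *holder)
{
    assert(holder->n_children < MAX_CHILDREN);
    waiter->blocked_by = holder;
    holder->children[holder->n_children++] = waiter;

    int visited = 0;
    for (struct thread *t = holder; t != NULL; t = t->blocked_by) {
        int updated = cprec_local(t);
        visited++;
        if (updated == t->cprec)
            break;              /* unchanged, so no ancestor changes either */
        t->cprec = updated;
    }
    return visited;
}
```

Because the chain of `blocked_by` pointers is assumed to be loop free, the walk terminates even without the early exit; the exit only saves work.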

As can be seen, a pleasing byproduct of our formalisation is that the properties in this section closely inform an implementation of PIP, namely whether the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqh_HTML.gif

needs to be reconfigured or current precedences need to be recalculated for an event. This information is provided by the lemmas we proved. We confirmed that our observations translate into practice by implementing our version of PIP on top of PINTOS, a small operating system written in C and used for teaching at Stanford University [19].⁵ While there is no formal connection between our formalisation and the C-code shown below, the results of the formalisation clearly shine through in the design of the code.

To implement PIP in PINTOS, we only need to modify the kernel functions corresponding to the events in our formal model. The events translate to the following function interface in PINTOS:

Event	PINTOS function

Our implicit assumption that every event is an atomic operation is ensured by the architecture of PINTOS (which allows disabling of interrupts when some operations are performed). The case where an unlocked resource is given next to the waiting thread with the highest precedence is realised in our implementation by priority queues. We implemented them as Braun trees [17], which provide efficient

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqs_HTML.gif

-operations for accessing and updating. In the code we shall describe below, we use the function

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqt_HTML.gif

, for inserting a new element into a priority queue, and the function

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqu_HTML.gif

, for updating the position of an element that is already in a queue. Both functions take an extra argument that specifies the comparison function used for organising the priority queue.
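To make the role of the extra comparison argument concrete, here is a minimal sketch of a priority queue with an insert and an update operation. It deliberately uses a small sorted array instead of a Braun tree, so both operations are linear rather than logarithmic; the names `pq`, `pq_insert`, `pq_update` and `higher_int` are hypothetical and do not correspond to the functions of our implementation.

```c
#include <stddef.h>

#define PQ_CAP 16

/* Comparison callback: nonzero if a should come before b in the queue. */
typedef int (*pq_before)(const void *a, const void *b);

/* Fixed-capacity queue kept in sorted order, front element at index 0. */
struct pq {
    void *items[PQ_CAP];
    size_t len;
};

/* Example comparison: treats elements as ints, larger values first. */
int higher_int(const void *a, const void *b)
{
    return *(const int *)a > *(const int *)b;
}

/* Insert elem at its sorted position; returns 0, or -1 if the queue is full. */
int pq_insert(struct pq *q, void *elem, pq_before before)
{
    if (q->len == PQ_CAP)
        return -1;
    size_t i = q->len++;
    while (i > 0 && before(elem, q->items[i - 1])) {
        q->items[i] = q->items[i - 1];   /* shift lower-ranked element back */
        i--;
    }
    q->items[i] = elem;
    return 0;
}

/* Re-position elem after its key changed; returns -1 if elem is not queued. */
int pq_update(struct pq *q, void *elem, pq_before before)
{
    size_t i = 0;
    while (i < q->len && q->items[i] != elem)
        i++;
    if (i == q->len)
        return -1;
    for (; i + 1 < q->len; i++)          /* remove elem from its old slot */
        q->items[i] = q->items[i + 1];
    q->len--;
    return pq_insert(q, elem, before);   /* and insert it again */
}
```

Passing the comparison as an argument lets the same queue code serve both the ready queue and the waiting queues of locks, whatever ordering each of them needs.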

Apart from having to implement relatively complex datastructures in C using pointers, our experience with the implementation has been very positive: our specification and formalisation of PIP translates smoothly to an efficient implementation in PINTOS. Let us illustrate this with the C-code for the function

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqv_HTML.gif

, shown in Fig. 3. This function implements the operation of requesting and, if free, locking of a resource by the current running thread. The convention in the PINTOS code is to use the terminology locks rather than resources. A lock is represented as a pointer to the structure lock (Line 1). Lines 2–4 are taken from the original code of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqw_HTML.gif

in PINTOS. They contain diagnostic code: first, there is a check that the lock is a “valid” lock by testing whether it is not NULL; second, a check that the code is not called as part of an interrupt—acquiring a lock should only be initiated by a request from a (user) thread, not from an interrupt; third, it is ensured that the current thread does not ask twice for a lock. These assertions are supposed to be satisfied because of the assumptions in PINTOS about how this code is called. If not, then the assertions indicate a bug in PINTOS and the result will be a “kernel panic”.

Lines 6 and 7 of lock_acquire make the operation of acquiring a lock atomic by disabling all interrupts, but saving them for resumption at the end of the function (Line 31). In Line 8, the interesting code with respect to scheduling starts: we first check whether the lock is already taken (its value is then 0 indicating “already taken”, or 1 for being “free”). In case the lock is taken, we enter the if-branch inserting the current thread into the waiting queue of this lock (Line 9). The waiting queue is referenced in the usual C-way as

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqy_HTML.gif

. Next, we record that the current thread is waiting for the lock (Line 10). Thus we have established two pointers: one in the waiting queue of the lock pointing to the current thread, and the other from the current thread pointing to the lock. According to our specification in Sect. 2 and the properties we were able to prove for

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figqz_HTML.gif

, we need to “chase” all the ancestor threads in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figra_HTML.gif

and update their current precedence; however, we only have to do this as long as there is a change in the current precedence.

The “chase” is implemented in the while-loop in Lines 13–24. To initialise the loop, we assign in Lines 11 and 12 the variable

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figrb_HTML.gif

to the owner of the lock. Inside the loop, we first update the precedence of the lock held by

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figrc_HTML.gif

(Line 14). Next, we check whether there is a change in the current precedence of

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figrd_HTML.gif

. If not, then we leave the loop, since nothing else needs to be updated (Lines 15 and 16). If there is a change, then we have to continue our “chase”. We check what lock the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figre_HTML.gif

is waiting for (Lines 17 and 18). If there is none, then the thread

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figrf_HTML.gif

is ready (the “chase” is finished with finding a root in the

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figrg_HTML.gif

). In this case we update the ready-queue accordingly (Lines 19 and 20). If there is a lock

https://static-content.springer.com/image/art%3A10.1007%2Fs10817-019-09511-5/MediaObjects/10817_2019_9511_Figrh_HTML.gif

is waiting for, we update the waiting queue for this lock and we continue the loop with the holder of that lock (Lines 22 and 23). After all current precedences have been updated, we finally need to block the current thread, because the lock it asked for was taken (Line 25).

If the lock the current thread asked for is not taken, we proceed with the else-branch (Lines 26–30). We first decrease the value of the lock to 0, meaning it is taken now (Line 27). Second, we update the reference of the holder of the lock (Line 28), and finally update the queue of locks the current thread already possesses (Line 29). The very last step is to enable interrupts again thus leaving the protected section.

Similar operations need to be implemented for the [inline code image: 10817_2019_9511_Figri_HTML.gif] function, which we do not show, however. The reader should note though that we did not verify our C-code. This is in contrast, for example, to the work on seL4, which actually verified in Isabelle/HOL that their C-code satisfies its specification, though this specification does not contain anything about PIP [11]. Our verification of PIP however provided us with (formally proven) insights on how to design the C-code. It gave us confidence that leaving the “chase” early, whenever there is no change in the calculated current precedence, does not break the correctness of the algorithm.

6 Conclusion

The Priority Inheritance Protocol (PIP) is a classic textbook algorithm used in many real-time operating systems in order to avoid the problem of Priority Inversion. Although classic and widely used, PIP does have its faults: for example it does not prevent deadlocks in cases where threads have circular lock dependencies.
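To make the remark about circular lock dependencies concrete, the following sketch shows how a caller could test whether a lock request would close a cycle. It is an illustration only and not part of PIP or of the implementation described in this paper. It assumes that each thread waits for at most one lock, that each lock has at most one holder, and that the graph is acyclic before the request; the names `rag_thread`, `rag_lock` and `would_close_cycle` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rag_lock;

struct rag_thread {
    struct rag_lock *waiting_on; /* lock the thread is blocked on, or NULL */
};

struct rag_lock {
    struct rag_thread *holder;   /* thread holding the lock, or NULL if free */
};

/* True if requester blocking on wanted would create a circular wait.
   Assumes the waiting chain starting at wanted contains no cycle yet. */
bool would_close_cycle(const struct rag_thread *requester,
                       const struct rag_lock *wanted)
{
    assert(requester != NULL);
    const struct rag_lock *l = wanted;
    while (l != NULL && l->holder != NULL) {
        if (l->holder == requester)
            return true;            /* the chain leads back to the requester */
        l = l->holder->waiting_on;  /* follow holder -> lock it waits for */
    }
    return false;                   /* reached a free lock or a ready thread */
}
```

The walk follows the same holder-to-lock chain as the “chase” for precedences, so under the stated assumptions its cost is bounded by the length of that chain.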

We had two goals in mind with our formalisation of PIP: One is to make the notions in the correctness proof by Sha et al. [24] precise so that they can be processed by a theorem prover. The reason is that a mechanically checked proof avoids the flaws that crept into their informal reasoning. We achieved this goal: The correctness of PIP now only hinges on the assumptions behind our formal model. The reasoning, which is sometimes quite intricate and tedious, has been checked by Isabelle/HOL. We can also confirm that Paulson’s inductive method for protocol verification [18] is quite suitable for our formal model and proof. The traditional application area of this method is security protocols.

The second goal of our formalisation is to provide a specification for actually implementing PIP. Textbooks, for example Vahalia [26, Section 5.6.5], explain how to use various implementations of PIP and abstractly discuss their properties, but surprisingly lack most details important for a programmer who wants to implement PIP (similarly Sha et al. [24]). That this is an issue in practice is illustrated by the email from Baker we cited in the Introduction. We also achieved this goal: The formalisation allowed us to efficiently implement our version of PIP on top of PINTOS, a simple instructional operating system for the x86 architecture implemented by Pfaff [19]. It also gives the first author enough data to enable his undergraduate students to implement PIP (as part of their OS course). A byproduct of our formalisation effort is that nearly all design choices for the implementation of the PIP scheduler are backed up with a proved lemma. We were also able to establish the property that the choice of the next thread which takes over a lock is irrelevant for the correctness of PIP. Moreover, we eliminated a crucial restriction present in the proof of Sha et al.: they require that critical sections nest properly, whereas our scheduler allows critical sections to overlap. What we are not able to do is to mechanically “synthesise” an actual implementation from our formalisation. To do so for C-code seems quite hard and is beyond the current technology available for Isabelle. Also, our proof-method based on events is not “computational” in the sense of having a concrete algorithm behind it: our formalisation is really more about the specification of PIP and ensuring that it has the desired properties (the informal specification by Sha et al. did not).
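The difference between properly nested and overlapping critical sections can be illustrated on the lock events of a single thread. The sketch below is a hypothetical helper, not part of our formalisation or implementation. It assumes that locks are identified by integers and that at most `MAX_HELD` locks are held at once; it accepts a trace only if every release frees the most recently acquired lock that is still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 16 /* assumed bound on simultaneously held locks */

enum cs_kind { CS_ACQUIRE, CS_RELEASE };

struct cs_event {
    enum cs_kind kind;
    int lock_id;
};

/* True if the trace of one thread releases its locks in LIFO order.
   A release of any lock that is not on top of the stack yields false. */
bool properly_nested(const struct cs_event *trace, size_t n)
{
    int held[MAX_HELD];
    size_t depth = 0;

    assert(trace != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (trace[i].kind == CS_ACQUIRE) {
            assert(depth < MAX_HELD);
            held[depth++] = trace[i].lock_id;  /* enter a critical section */
        } else if (depth == 0 || held[depth - 1] != trace[i].lock_id) {
            return false;                      /* sections overlap */
        } else {
            depth--;                           /* leave the innermost section */
        }
    }
    return true;
}
```

A trace that acquires locks 1 and 2 and then releases 2 before 1 is properly nested; releasing 1 before 2 is an overlap, which is the case excluded by Sha et al. and permitted by our scheduler.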

PIP is a scheduling algorithm for single-processor systems. We are now living in a multi-processor world. Priority Inversion certainly also occurs there, see for example work by Brandenburg, and Davis and Burns [1, 6]. However, there is very little “foundational” work about PIP-algorithms on multi-processor systems. We are not aware of any correctness proofs, not even informal ones. There is an implementation of a PIP-algorithm for multi-processors as part of the “real-time” effort in Linux, including an informal description of the implemented scheduling algorithm given by Rostedt in [23]. We estimate that the formal verification of this algorithm, involving more fine-grained events, is an order of magnitude harder than the one we presented here, but still within reach of current theorem proving technology. We leave this for future work.

To us, it seems sound reasoning about scheduling algorithms is fiendishly difficult if done informally by “pencil-and-paper”. We infer this from the flawed proof in the paper by Sha et al. [24] and also from [22] where Regehr points out an error in a paper about Preemption Threshold Scheduling by Wang and Saksena [28]. The use of a theorem prover was invaluable to us in order to be confident about the correctness of our reasoning (for example no corner case can be overlooked). The most closely related work to ours is the formal verification in PVS of the Priority Ceiling Protocol done by Dutertre [7]—another solution to the Priority Inversion problem, which however needs static analysis of programs in order to avoid it. There have been earlier formal investigations into PIP [8, 10, 29], but they employ model checking techniques. The results obtained by them apply, however, only to systems with a fixed size, such as a fixed number of events and threads. In contrast, our result applies to systems of arbitrary size. Moreover, our result is a good witness for one of the major reasons to be interested in machine checked reasoning: gaining deeper understanding of the subject matter.

Our formalisation consists of around 600 lemmas and overall 9200 lines of readable and commented Isabelle/Isar code with a few apply-scripts interspersed. The formal model of PIP is 310 lines long; our graph theory implementation using relations is 1615 lines; the basic properties of PIP take around 5000 lines of code; and the formal correctness proof 1250 lines. The properties relevant for an implementation require 1000 lines. The code of our formalisation can be downloaded from the Mercurial repository at http://talisker.inf.kcl.ac.uk/cgi-bin/repos.cgi/pip.

Acknowledgements

We are grateful for the comments we received from anonymous referees. We are also deeply saddened about the tragic death of our co-author, colleague and friend, Chunhan, who suddenly died on 22 December 2016. He very much drove this work forward and extended it in his PhD-thesis with a formal verification of a SELinux-style access control system. He was a stellar student and a very promising young researcher in the field of interactive theorem proving. He was liked by many and indispensable for organising the ITP’15 conference in Nanjing. Chunhan left behind a grieving wife and an 8-year-old son.

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Sha et al. call it the Basic Priority Inheritance Protocol [24] and others sometimes also call it Priority Boosting, Priority Donation or Priority Lending.

We shall come back later to the case of PIP on multi-processor systems.

For well-founded we use the quite natural definition from Isabelle/HOL.

This situation is similar to the infamous occurs check in Prolog: In order to say anything meaningful about unification, one needs to perform an occurs check. But in practice the occurs check is omitted and the responsibility for avoiding problems rests with the programmer.

An alternative would have been the small Xv6 operating system used for teaching at MIT [4, 5]. However this operating system implements a simple round robin scheduler that lacks stubs for dealing with priorities. This is inconvenient for our purposes.

1. Brandenburg, B.B.: Scheduling and Locking in Multiprocessor Real-Time Operating Systems. PhD thesis, The University of North Carolina at Chapel Hill (2011)

2. Budin, L., Jelenkovic, L.: Time-constrained programming in Windows NT environment. In: Proceedings of the IEEE International Symposium on Industrial Electronics (ISIE), vol. 1, pp. 90–94 (1999)

3. Buttazzo, G.C.: Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications, 3rd edn. Springer, Berlin (2011)

4. Cox, R., Kaashoek, F., Morris, R.: Xv6. http://pdos.csail.mit.edu/6.828/2012/xv6.html

5. Cox, R., Kaashoek, F., Morris, R.: Xv6: A Simple, Unix-like Teaching Operating System. Technical report, MIT (2012)

6. Davis, R.I., Burns, A.: A survey of hard real-time scheduling for multiprocessor systems. ACM Comput. Surv. 43(4), 35:1–35:44 (2011)

7. Dutertre, B.: The priority ceiling protocol: formalization and analysis using PVS. In: Proceedings of the 21st IEEE Conference on Real-Time Systems Symposium (RTSS), pp. 151–160. IEEE Computer Society (2000)

8. Faria, J.M.S.: Formal Development of Solutions for Real-Time Operating Systems with TLA+/TLC. PhD thesis, University of Porto (2008)

9. Haftmann, F., Wenzel, M.: Local theory specifications in Isabelle/Isar. In: Proceedings of the International Conference on Types, Proofs and Programs (TYPES), vol. 5497 of LNCS, pp. 153–168 (2008)

10. Jahier, E., Halbwachs, B., Raymond, P.: Synchronous modeling and validation of priority inheritance schedulers. In: Proceedings of the 12th International Conference on Fundamental Approaches to Software Engineering (FASE), vol. 5503 of LNCS, pp. 140–154 (2009)

11. Klein, G., Andronick, J., Elphinstone, K., Heiser, G., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: seL4: formal verification of an OS kernel. Commun. ACM 53(6), 107–115 (2010)

12. Lampson, B.W., Redell, D.D.: Experiences with processes and monitors in Mesa. Commun. ACM 23(2), 105–117 (1980)

13. Laplante, P.A., Ovaska, S.J.: Real-Time Systems Design and Analysis: Tools for the Practitioner, 4th edn. Wiley, Hoboken (2011)

14. Li, Q., Yao, C.: Real-Time Concepts for Embedded Systems. CRC Press, Boca Raton (2003)

15. Liu, J.W.S.: Real-Time Systems. Prentice Hall, Upper Saddle River (2000)

16. Moylan, P.J., Betz, R.E., Middleton, R.H.: The Priority Disinheritance Problem. Technical Report EE9345, University of Newcastle (1993)

17. Paulson, L.C.: ML for the Working Programmer. Cambridge University Press, Cambridge (1996)

18. Paulson, L.C.: The inductive approach to verifying cryptographic protocols. J. Comput. Secur. 6(1–2), 85–128 (1998)

19. Pfaff, B.: PINTOS. http://www.stanford.edu/class/cs140/projects/

20. Rajkumar, R.: Synchronization in Real-Time Systems: A Priority Inheritance Approach. Kluwer, Dordrecht (1991)

21. Reeves, G.E.: Re: What Really Happened on Mars? Risks Forum 19(54) (1998)

22. Regehr, J.: Scheduling tasks with mixed preemption relations for robustness to timing faults. In: Proceedings of the 23rd IEEE Real-Time Systems Symposium (RTSS), pp. 315–326 (2002)

23. Rostedt, S.: RT-Mutex Implementation Design. Linux Kernel Distribution at www.kernel.org/doc/Documentation/rt-mutex-design.txt

24. Sha, L., Rajkumar, R., Lehoczky, J.P.: Priority inheritance protocols: an approach to real-time synchronization. IEEE Trans. Comput. 39(9), 1175–1185 (1990)

25. Silberschatz, A., Galvin, P.B., Gagne, G.: Operating System Concepts, 9th edn. Wiley, Hoboken (2013)

26. Vahalia, U.: UNIX Internals: The New Frontiers. Prentice-Hall, Upper Saddle River (1996)

27. Wang, J., Yang, H., Zhang, X.: Liveness reasoning with Isabelle/HOL. In: Proceedings of the 22nd International Conference on Theorem Proving in Higher Order Logics (TPHOLs), vol. 5674 of LNCS, pp. 485–499 (2009)

28. Wang, Y., Saksena, M.: Scheduling fixed-priority tasks with preemption threshold. In: Proceedings of the 6th Workshop on Real-Time Computing Systems and Applications (RTCSA), pp. 328–337 (1999)

29. Wellings, A., Burns, A., Santos, O.M., Brosgol, B.M.: Integrating priority inheritance algorithms in the real-time specification for Java. In: Proceedings of the 10th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC), pp. 115–123. IEEE Computer Society (2007)

30. Yodaiken, V.: Against Priority Inheritance. Technical report, Finite State Machine Labs (FSMLabs) (2004)

31. Zhang, X., Urban, C., Wu, C.: Priority inheritance protocol proved correct. In: Proceedings of the 3rd Conference on Interactive Theorem Proving (ITP), vol. 7406 of LNCS, pp. 217–232 (2012)

Title: Priority Inheritance Protocol Proved Correct
Authors: Xingyuan Zhang, Christian Urban, Chunhan Wu
Publication date: 12-02-2019
Publisher: Springer Netherlands
Published in: Journal of Automated Reasoning / Issue 1/2020
Print ISSN: 0168-7433
Electronic ISSN: 1573-0670
DOI: https://doi.org/10.1007/s10817-019-09511-5
