Skip to main content
Erschienen in: Journal of Automated Reasoning 7/2020

Open Access 06.06.2020

Proof-Producing Synthesis of CakeML from Monadic HOL Functions

verfasst von: Oskar Abrahamsson, Son Ho, Hrutvik Kanabar, Ramana Kumar, Magnus O. Myreen, Michael Norrish, Yong Kiam Tan

Erschienen in: Journal of Automated Reasoning | Ausgabe 7/2020

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We introduce an automatic method for producing stateful ML programs together with proofs of correctness from monadic functions in HOL. Our mechanism supports references, exceptions, and I/O operations, and can generate functions manipulating local state, which can then be encapsulated for use in a pure context. We apply this approach to several non-trivial examples, including the instruction encoder and register allocator of the otherwise pure CakeML compiler, which now benefits from better runtime performance. This development has been carried out in the HOL4 theorem prover.
Hinweise

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

This paper is about bridging the gap between programs verified in logic and verified implementations of those programs in a programming language (and ultimately machine code). As a toy example, consider computing the nth Fibonacci number. The following is a recursion equation for a function, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figa_HTML.gif , in higher-order logic (HOL) that does the job:
A hand-written implementation (shown here in CakeML [10], which has similar syntax and semantics to Standard ML) would look something like this:
In moving from mathematics to a real implementation, some issues are apparent:
(i)
We use a tail-recursive linear-time algorithm, rather than the exponential-time recursion equation.
 
(ii)
The whole program is not a pure function: it does I/O, reading its argument from the command line and printing the answer to standard output.
 
(iii)
We use exception handling to deal with malformed inputs (if the arguments do not start with a string representing a natural number, hd or s2n may raise an exception).
 
The first of these issues (i) can easily be handled in the realm of logical functions. We define a tail-recursive version in logic:
then produce a correctness theorem, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fige_HTML.gif , with a simple inductive proof (a 5-line tactic proof in HOL4, not shown).
Now, because https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figf_HTML.gif is a logical function with an obvious computational counterpart, we can use proof-producing synthesis techniques [14] to automatically synthesise code verified to compute it. We thereby produce something like the first line of the CakeML code above, along with a theorem relating the semantics of the synthesised code back to the function in logic.
But when it comes to handling the other two issues, (ii) and (iii), and producing and verifying the remaining three lines of CakeML code, our options are less straightforward. The first issue was easy because we were working with a shallow embedding, where one writes the program as a function in logic and proves properties about that function directly. Shallow embeddings rely on an analogy between mathematical functions and procedures in a pure functional programming language. However, effects like state, I/O, and exceptions, can stretch this analogy too far. The alternative is a deep embedding: one writes the program as an input to a formal semantics, which can accurately model computational effects, and proves properties about its execution under those semantics.
Proofs about shallow embeddings are relatively easy since they are in the native language of the theorem prover, whereas proofs about deep embeddings are filled with tedious details because of the indirection through an explicit semantics. Still, the explicit semantics make deep embeddings more realistic. An intermediate option that is suitable for the effects we are interested in—state/references, exceptions, and I/O—is to use monadic functions: one writes (shallow) functions that represent computations, aided by a composition operator (monadic bind) for stitching together effects. The monadic approach to writing effectful code in a pure language may be familiar from the Haskell language which made it popular.
For our nth Fibonacci example, we can model the effects of the whole program with a monadic function, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figg_HTML.gif , that calls the pure function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figh_HTML.gif to do the calculation. Figure 1 shows how https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figi_HTML.gif can be written using https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_IEq1_HTML.gif -notation familiar from Haskell. This is as close as we can get to capturing the effectful behaviour of the desired CakeML program while remaining in a shallow embedding. Now how can we produce real code along with a proof that it has the correct semantics? If we use the proof-producing synthesis techniques mentioned above [14], we produce pure CakeML code that exposes the monadic plumbing in an explicit state-passing style. But we would prefer verified effectful code that uses native features of the target language (CakeML) to implement the monadic effects.
In this paper, we present an automated technique for producing verified effectful code that handles I/O, exceptions, and other issues arising in the move from mathematics to real implementations. Our technique systematically establishes a connection between shallowly embedded functions in HOL with monadic effects and deeply embedded programs in the impure functional language CakeML. The synthesised code is efficient insofar as it uses the native effects of the target language and is close to what a real implementer would write. For example, given the monadic https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figk_HTML.gif function above, our technique produces essentially the same CakeML program as on the first page (but with a let for every monad bind), together with a proof that the synthesised program is a refinement.
Contributions. Our technique for producing verified effectful code from monadic functions builds on a previous limited approach [14]. The new generalised method adds support for the following features:
  • global references and exceptions (as before, but generalised),
  • mutable arrays (both fixed and variable size),
  • input/output (I/O) effects,
  • local mutable arrays and references, which can be integrated seamlessly with code synthesis for otherwise pure functions,
  • composable effects, whereby different state and exception monads can be combined using a lifting operator, and,
  • support for recursive programs where termination depends on monadic state.
As a result, we can now write whole programs as shallow embeddings and obtain real verified code via synthesis. Prior to this work, whole program verification in CakeML involved manual deep embedding proofs for (at the very least) the I/O wrapper. To exercise our toolchain, we apply it to several examples:
  • the nth Fibonacci example already seen (exceptions, I/O)
  • the Floyd Warshall algorithm for finding shortest paths (arrays)
  • an in-place quicksort algorithm (polymorphic local arrays, exceptions)
  • the instruction encoder in the CakeML compiler’s assembler (local arrays)
  • the CakeML compiler’s register allocator (local refs, arrays)
  • the Candle theorem prover’s kernel [9] (global refs, exceptions)
  • an OpenTheory [8] article checker (global refs, exceptions, I/O)
In Sect. 6, we compare runtimes with the previous non-stateful versions of CakeML’s register allocator and instruction encoder; and for the OpenTheory reader we compare the amount of code/proof required before and after using our technique.
Additions. This paper is an extended version of our earlier conference paper [6]. The following contributions are new to this work: a brief discussion of how polymorphic functions that use type variables in their local state can be synthesized (Sect. 4), a section on synthesis of recursive programs where termination depends on the monadic state (Sect. 5), and new case studies using our tool, e.g., quicksort with polymorphic local arrays (Sect. 4), and the CakeML compiler’s instruction encoder (Sect. 6).

2 High-Level Ideas

This paper combines the following three concepts in order to deliver the contributions listed above. The main ideas will be described briefly in this section, while subsequent sections will provide details. The three concepts are:
(i)
synthesis of stateful ML code as described in our previous work [14],
 
(ii)
separation logic [16] as used by characteristic formulae for CakeML [5], and
 
(iii)
a new abstract synthesis mode for the CakeML synthesis tools [14].
 
Our previous work on proof-producing synthesis of stateful ML (i) was severely limited by the requirement to have a hard-coded invariant on the program’s state. There was no support for I/O and all references had to be declared globally. At the time of its development, we did not have a satisfactory way of generalising the hard-coded state invariant.
In this paper we show (in Sect. 3) that the separation logic of CF (2) can be used to neatly generalise the hard-coded state invariant of our prior work (i). CF-style separation logic easily supports references and arrays, including resizable arrays, and, supports I/O too because it allows us to treat I/O components as if they are heap components. Furthermore, by carefully designing the integration of (i) and (2), we retain the frame rule from the separation logic. In the context of code synthesis, this frame rule allows us to implement a lifting feature for changing the type of the state-and-exception monads. Being able to change types in the monads allows us to develop reusable libraries—e.g. verified file I/O functions—that users can lift into the monad that is appropriate for their application.
The combination of (i) and (2) does not by itself support synthesis of code with local state due to inherited limitations of (i), wherein the generated code must be produced as a concrete list of global declarations. For example, if monadic functions, say https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figl_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figm_HTML.gif , refer to a common reference, say r, then r must be defined globally:
In this paper (in Sect. 4), we introduce a new abstract synthesis mode (3) which removes the requirement of generating code that only consists of a list of global declarations, and, as a result, we are now able to synthesise code such as the following, where the reference r is a local variable:
In the input to the synthesis tool, this declaration and initialisation of local state corresponds to applying the state-and-exception monad. Expressions that fully apply the state-and-exception monad can subsequently be used in the synthesis of pure CakeML code: the monadic synthesis tool can prove a pure specification for such expressions, thereby encapsulating the monadic features.

3 Generalised Approach to Synthesis of Stateful ML Code

This section describes how our previous approach to proof-producing synthesis of stateful ML code [14] has been generalised. In particular, we explain how the separation logic from our previous work on characteristic formulae [5] has been used for the generalisation (Sect. 3.3); and how this new approach adds support for user-defined references, fixed- and variable-length arrays, I/O functions (Sect. 3.4), and a handy feature for reusing state-and-exception monads (Sect. 3.5).
In order to make this paper as self-contained as possible, we start with a brief look at how the semantics of CakeML is defined (Sect. 3.1) and how our previous work on synthesis of pure CakeML code works (Sect. 3.2), since the new synthesis method for stateful code is an evolution of the original approach for pure code.

3.1 Preliminaries: CakeML Semantics

The semantics of the CakeML language is defined in the functional big-step style [15], which means that the semantics is an interpreter defined as a functional program in the logic of a theorem prover.
The definition of the semantics is layered. At the top-level the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figp_HTML.gif function defines what the observable I/O events are for a given whole program. However, more relevant to the presentation in this paper is the next layer down: a function called https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figq_HTML.gif that describes exactly how expressions evaluate. The type of the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figr_HTML.gif function is shown below. This function takes as arguments a state (with a type variable for the I/O environment), a value environment, and a list of expressions to evaluate. It returns a new state and a value result.
The semantics state is defined as the record type below. The fields relevant for this presentation are: https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figt_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figu_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figv_HTML.gif . The https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figw_HTML.gif field is a list of store values that acts as a mapping from reference names (list index) to reference and array values (list element). The https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figx_HTML.gif is a logical clock for the functional big-step style. The clock allows us to prove termination of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figy_HTML.gif and is, at the same time, used for reasoning about divergence. Lastly, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figz_HTML.gif is the parametrised oracle model of the foreign function interface, i.e. I/O environment.
A call to the function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figab_HTML.gif returns one of two results: https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figac_HTML.gif for successfully terminating computations, and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figad_HTML.gif for stuck computations.
Successful computations, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figae_HTML.gif , return a list https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figaf_HTML.gif of CakeML values. CakeML values are modelled in the semantics using a datatype called https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figag_HTML.gif . This datatype includes (among other things) constructors for (mutually recursive) closures ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figah_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figai_HTML.gif ), datatype constructor values ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figaj_HTML.gif ), and literal values ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figak_HTML.gif ) such as integers, strings, characters etc. These will be explained when needed in the rest of the paper.
Stuck computations, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figal_HTML.gif , carry an error value https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figam_HTML.gif that is one of the following. For this paper, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figan_HTML.gif is the most relevant case.
  • https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figao_HTML.gif indicates that evaluation results in an uncaught exception https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figap_HTML.gif . These exceptions can be caught with a handle in CakeML.
  • https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figaq_HTML.gif indicates that evaluation of the expression consumes all of the logical clock. Programs that hit this error for all initial values of the clock are considered diverging.
  • https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figar_HTML.gif , for other kinds of errors, e.g. when evaluating ill-typed expressions, or attempting to access unbound variables.

3.2 Preliminaries: Synthesis of Pure ML Code

Our previous work [14] describes a proof-producing algorithm for synthesising CakeML functions from functions in higher-order logic. Here proof-producing means that each execution proves a theorem (called a certificate theorem) guaranteeing correctness of that execution of the algorithm. In our setting, these theorems relate the CakeML semantics of the synthesised code with the given HOL function.
The whole approach is centred around a systematic way of proving theorems relating HOL functions (i.e. HOL terms) with CakeML expressions. In order for us to state relations between HOL terms and CakeML expressions, we need a way to state relations between HOL terms and CakeML values. For this we use relations ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figas_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figat_HTML.gif  \(\cdot \), \(\cdot \longrightarrow \cdot \), etc.) which we call refinement invariants. The definition of the simple https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figau_HTML.gif refinement invariant is shown below: https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figav_HTML.gif is true if CakeML value https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figaw_HTML.gif of type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figax_HTML.gif represents the HOL integer https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figay_HTML.gif of type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figaz_HTML.gif .
Most refinement invariants are more complicated, e.g. https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbb_HTML.gif states that CakeML value https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbc_HTML.gif represents lists of int lists https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbd_HTML.gif of HOL type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbe_HTML.gif .
We now turn to CakeML expressions: we define a predicate called https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbf_HTML.gif in order to conveniently state relationships between HOL terms and CakeML expressions. The intuition is that https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbg_HTML.gif is true if https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbh_HTML.gif evaluates (in environment https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbi_HTML.gif ) to some result https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbj_HTML.gif (of HOL type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbk_HTML.gif ) such that https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbl_HTML.gif holds for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbm_HTML.gif , i.e. https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbn_HTML.gif . The formal definition below is cluttered by details regarding the clock and references: there must be a large enough clock and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbo_HTML.gif may allocate new references, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbp_HTML.gif , but must not modify any existing references, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbq_HTML.gif . We express this restriction on the references using list append https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbr_HTML.gif . Note that any list index that can be looked up in https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbs_HTML.gif has the same look up in https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbt_HTML.gif .
The use of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbv_HTML.gif and the main idea behind the synthesis algorithm is most conveniently described using an example. The example we consider here is the following HOL function:
The main part of the synthesis algorithm proceeds as a syntactic bottom-up pass over the given HOL term. In this case, the bottom-up pass traverses HOL term https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbx_HTML.gif . The result of each stage of the pass is a theorem stated in terms of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figby_HTML.gif in the format shown below. Such theorems state a connection between a HOL term https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figbz_HTML.gif and some generated https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figca_HTML.gif w.r.t. a refinement invariant https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcb_HTML.gif that is appropriate for the type of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcc_HTML.gif .
For our little example, the algorithm derives the following theorems for the subterms https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figce_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcf_HTML.gif , which are the leaves of the HOL term. Here and elsewhere in this paper, we display CakeML abstract syntax as concrete syntax inside \(\lfloor {}\cdots \rfloor {}\), i.e. \(\lfloor {}\texttt {1}\rfloor {}\) is actually the CakeML expression https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcg_HTML.gif in the theorem prover HOL4; similarly \(\lfloor {}\texttt {x}\rfloor {}\) is actually displayed as https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figch_HTML.gif in HOL4. Note that both theorems below are of the required general format.
https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Equ1_HTML.png
(1)
The algorithm uses theorems (1) when proving a theorem for the compound expression https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figci_HTML.gif . The process is aided by an auxiliary lemma for integer addition, shown below. The synthesis algorithm is supported by several such pre-proved lemmas for various common operations.
By choosing the right specialisations for the variables, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figck_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcl_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcm_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcn_HTML.gif , the algorithm derives the following theorem for the body of the running example. Here the assumption on evaluation of \(\lfloor {}\texttt {x}\rfloor {}\) was inherited from (1).
https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Equ2_HTML.png
(2)
Next, the algorithm needs to introduce the \(\lambda \)-binder in https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figco_HTML.gif . This can be done by instantiation of the following pre-proved lemma. Note that the lemma below introduces a refinement invariant for function types, \(\longrightarrow \), which combines refinement invariants for the input and output types of the function [14].
An appropriate instantiation and combination with (2) produces the following:
which, after only minor reformulation, becomes a certificate theorem for the given HOL function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcr_HTML.gif :
Additional notes. The main part of the synthesis algorithm is always a bottom-up traversal as described above. However, synthesis of recursive functions requires an additional post-processing phase which involves an automatic induction proof. We omit a detailed description of such induction proofs since we have described our solution previously [14]. However, we discuss our solution at a high level in Sect. 5.3 where we explain how the previously published approach has been modified to tackle monadic programs in which termination depends on the monadic state.

3.3 Synthesis of Stateful ML Code

Our algorithm for synthesis of stateful ML is very similar to the algorithm described above for synthesis of pure CakeML code. The main differences are:
  • the input HOL terms must be written in a state-and-exception monad, and
  • instead of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figct_HTML.gif and \(\cdot \longrightarrow \cdot \), the derived theorems use https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcu_HTML.gif and \(\cdot \longrightarrow ^{M}\cdot \),
where https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcv_HTML.gif and \(\cdot \longrightarrow ^{M}\cdot \) relate the monad’s state to the references and foreign function interface of the underlying CakeML state (fields https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcw_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcx_HTML.gif ). These concepts will be described below.
Generic state-and-exception monad. The new generalised synthesis work-flow uses the following state-and-exception monad https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcy_HTML.gif , where https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figcz_HTML.gif is the state type, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figda_HTML.gif is the return type, and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdb_HTML.gif is the exception type.
We define the following interface for this monad type. Note that syntactic sugar is often used: in our case, we write https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdd_HTML.gif (as was done in Sect. 1) when we mean https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figde_HTML.gif .
Functions that update the content of state can only be defined once the state type is instantiated. A function for changing a monad https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdg_HTML.gif to have a different state type is introduced in Sect. 3.5.
Definitions and lemmas for synthesis. We define https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdh_HTML.gif as follows. A CakeML source expression https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdi_HTML.gif is considered to satisfy an execution relation https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdj_HTML.gif if for any CakeML state https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdk_HTML.gif , which is related by https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdl_HTML.gif to the state monad state https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdm_HTML.gif and state assertion https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdn_HTML.gif , the CakeML expression https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdo_HTML.gif evaluates to a result https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdp_HTML.gif such that the relation https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdq_HTML.gif accepts the transition and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdr_HTML.gif holds for state assertion https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figds_HTML.gif . The auxiliary functions https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdt_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdu_HTML.gif will be described below. The first argument https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdv_HTML.gif can be used to restrict effects to references only, as described a few paragraphs further down.
In the definition above, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdx_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdy_HTML.gif are used to check that the user-specified state assertion https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figdz_HTML.gif relates the CakeML states and the monad states. Furthermore, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figea_HTML.gif ensures that the separation logic frame rule is true. Both use the separation logic set-up from our previous work on characteristic formulae for CakeML [5], where we define a function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figeb_HTML.gif which, given a projection https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figec_HTML.gif and CakeML state https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figed_HTML.gif , turns the CakeML state into a set representation of the reference store and foreign-function interface (used for I/O).
The https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figee_HTML.gif in the definition above is a pair https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figef_HTML.gif containing a heap assertion https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figeg_HTML.gif and the projection https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figeh_HTML.gif . We define https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figei_HTML.gif to state that the heap assertion produced by applying https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figej_HTML.gif to the current monad state https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figek_HTML.gif must be true for some subset produced by https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figel_HTML.gif when applied to the CakeML state https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figem_HTML.gif . Here https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figen_HTML.gif is the separating conjunction and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figeo_HTML.gif is true for any heap.
The relation https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figeq_HTML.gif states: any frame https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figer_HTML.gif that is true separately from https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figes_HTML.gif for the initial state is also true for the final state; and if the references-only https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figet_HTML.gif configuration is set, then the only difference in the states must be in the references and clock, i.e. no I/O operations are permitted. The https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figeu_HTML.gif flag is instantiated to true when a pure specification ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figev_HTML.gif ) is proved for local state (Sect. 4).
We prove lemmas to aid the synthesis algorithm in construction of proofs. The lemmas shown in this paper use the following definition of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figex_HTML.gif .
Synthesis makes use of the following two lemmas in proofs involving monadic https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figez_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfa_HTML.gif . For https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfb_HTML.gif , synthesis proves an https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfc_HTML.gif -theorem for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfd_HTML.gif . For https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfe_HTML.gif , it proves a theorem that fits the shape of the first four lines of the lemma and returns a theorem consisting of the last two lines, appropriately instantiated.

3.4 References, Arrays and I/O

The synthesis algorithm uses specialised lemmas when the generic state-and-exception monad has been instantiated. Consider the following instantiation of the monad’s state type to a record type. The programmer’s intention is that the lists are to be synthesised to arrays in CakeML and the I/O component https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfg_HTML.gif is a model of a file system (taken from a library).
With the help of getter- and setter-functions and library functions for file I/O, users can conveniently write monadic functions that operate over this state type.
When it comes to synthesis, the automation instantiates https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfi_HTML.gif with an appropriate heap assertion, in this instance: https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfj_HTML.gif . The user has informed the synthesis tool that https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfk_HTML.gif is to be a fixed-size array and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfl_HTML.gif is to be a resizable-size array. A resizable-array is implemented as a reference that contains an array, since CakeML (like SML) does not directly support resizing arrays. Below, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfm_HTML.gif asserts that https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfn_HTML.gif relates the value held in a reference at a fixed store location https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfo_HTML.gif to the integer in https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfp_HTML.gif . Similarly, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfq_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfr_HTML.gif specify a connection for the array fields. Lastly, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfs_HTML.gif is a heap assertion for the file I/O taken from a library.
Automation specialises pre-proved https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfu_HTML.gif lemmas for each term that might be encountered in the monadic functions. As an example, a monadic function might contain an automatically defined function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfv_HTML.gif for updating array https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfw_HTML.gif . Anticipating this, synthesis automation can, at set-up time, automatically derive the following lemma which it can use when it encounters https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfx_HTML.gif .

3.5 Combining Monad State Types

Previously developed monadic functions (e.g. from an existing library) can be used as part of a larger context, by combining state-and-exception monads with different state types. Consider the case of the file I/O in the example from above. The following https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figfz_HTML.gif theorem has been proved in the CakeML basis library.
This can be used directly if the state type of the monad is the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgb_HTML.gif type. However, our example above uses https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgc_HTML.gif as the state type.
To overcome such type mismatches, we define a function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgd_HTML.gif which can bring a monadic operation defined in libraries into the required context. The type of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figge_HTML.gif is https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgf_HTML.gif , for appropriate https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgg_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgh_HTML.gif .
Our https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgj_HTML.gif function changes the state type. A simpler lifting operation can be used to change the exception type.
For our example, we define https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgk_HTML.gif as a function that performs https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgl_HTML.gif on the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgm_HTML.gif -part of a https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgn_HTML.gif . (The https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgo_HTML.gif example in Sect. 1 used a similar https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgp_HTML.gif .)
Our synthesis mechanism automatically derives a lemma that can transfer any https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgr_HTML.gif result for the file I/O model to a similar https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgs_HTML.gif result wrapped in the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgt_HTML.gif function. Such lemmas are possible because of the separation logic frame rule that is part of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgu_HTML.gif . The generic lemma is the following:
And the following is the transferred lemma, which enables synthesis of HOL terms of the form https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgw_HTML.gif for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgx_HTML.gif -synthesisable https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figgy_HTML.gif .
Changing the monad state type comes at no additional cost to the user; our tool is able to derive both the generic and transferred https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figha_HTML.gif lemmas, when provided with the original https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighb_HTML.gif result.

4 Local State and the Abstract Synthesis Mode

This section explains how we have adapted the method described above to also support generation of code that uses local state and local exceptions. These features enable use of stateful code ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighc_HTML.gif ) in a pure context ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighd_HTML.gif ). We used these features to significantly speed up parts of the CakeML compiler (see Sect. 6).
In the monadic functions, users indicate that they want local state to be generated by using the following https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighe_HTML.gif function. In the logic, the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighf_HTML.gif function essentially just applies a monadic function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighg_HTML.gif to an explicitly provided state https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighh_HTML.gif .
In the generated code, an application of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighj_HTML.gif to a concrete monadic function, say https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighk_HTML.gif , results in code of the following form:
Synthesis of locally effectful code is made complicated in our setting for two reasons: (i) there are no fixed locations where the references and arrays are stored, e.g. we cannot define https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighm_HTML.gif as used in the definition of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighn_HTML.gif in Sect. 3.4; and (ii) the local names of state components must be in scope for all of the function definitions that depend on local state.
Our solution to challenge (i) is to leave the location values as variables ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figho_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighp_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighq_HTML.gif ) in the heap assertion when synthesising local state. To illustrate, we will adapt the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighr_HTML.gif from Sect. 3.4: we omit https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighs_HTML.gif in the state because I/O cannot be made local. The local-state enabled heap assertion is:
The lemmas referring to local state now assume they can find the right variable locations with variable look-ups.
Challenge (ii) was caused by technical details of our previous synthesis methods. The previous version was set up to only produce top-level declarations, which is incompatible with the requirement to have local (not globally fixed) state declarations shared between several functions. The requirement to only have top-level declarations arose from our desire to keep things simple: each synthesised function is attached to the end of a concrete linear program that is being built. It is beneficial to be concrete because then each assumption on the lexical environment where the function is defined can be proved immediately on definition. We will call this old approach the concrete mode of synthesis, since it eagerly builds a concrete program.
In order to support having functions access local state, we implement a new abstract mode of synthesis. In the abstract mode, each assumption on the lexical environment is left as an unproved side condition as long as possible. This allows us to define functions in a dynamic environment.
To prove a pure specification ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighv_HTML.gif ) from the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighw_HTML.gif theorems, the automation first proves that the generated state-allocation and -initialisation code establishes the relevant heap assertion (e.g. https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighx_HTML.gif ); it then composes the abstractly synthesised code while proving the environment-related side conditions (e.g. presence of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighy_HTML.gif ). The final proof of an https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Fighz_HTML.gif theorem requires instantiating the references-only https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figia_HTML.gif flag to true, in order to know that no I/O occurs (Sect. 3.3).

4.1 Type Variables in Local Monadic State

Our previous approach [14] allowed synthesis of (pure) polymorphic functions. Our new mechanism is able to support the same level of generality by permitting type variables in the type of monadic state that is used locally. As an example, consider a monadic implementation of an in-place quicksort algorithm, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figib_HTML.gif , with the following type signature:
The function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figid_HTML.gif takes a list of values of type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figie_HTML.gif and an ordering on https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figif_HTML.gif as input, producing a sorted list as output. However, internally it copies the input list into a mutable array in order to perform fast in-place random accesses.
The heap assertion for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_IEq14_HTML.gif is called https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figig_HTML.gif , and is defined below:
Here, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figii_HTML.gif is a refinement invariant for logical values of type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figij_HTML.gif . This parametrisation over state type variables is similar to the way in which location values were parametrised to solve challenge (i) above.
Applying https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figik_HTML.gif to https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figil_HTML.gif , and synthesising CakeML from the result gives the following certificate theorem which makes the stateful https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figim_HTML.gif callable from pure translations.
Here https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_IEq15_HTML.gif is the refinement invariant for type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_IEq16_HTML.gif .
For the quicksort example, we have manually proved that https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figio_HTML.gif will always return a https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figip_HTML.gif value, provided the comparison function orders values of type https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figiq_HTML.gif . The result of this effort is CakeML code for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figir_HTML.gif that uses state internally, but can be used as if it is a completely pure function without any use of state or exceptions.

5 Termination That Depends on Monadic State

In this section, we describe how the proof-producing synthesis method in Sect. 3 has been extended to deal with a class of recursive monadic functions whose termination depends on the state hidden in the monad. This class of functions creates new difficulties, as (i) the HOL4 function definition system is unable to prove termination of these functions; and, (ii) our synthesis method relies on induction theorems produced by the definition system to discharge preconditions during synthesis.
We address issue (i) by extending the HOL4 definition system with a set of congruence rewrites for the monadic bind operation, https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figis_HTML.gif (Sect. 5.2). We then explain, at a high level, how the proof-producing synthesis in Sect. 3 is extended to deal with the preconditions that arise when synthesising code from recursive monadic functions (Sect. 5.3).
We begin with a brief overview of how recursive function definitions are handled by the HOL4 function definition system (Sect. 5.1).

5.1 Preliminaries: Function Definitions in HOL4

In order to accept recursive function definitions, the HOL4 system requires a well-founded relation to be found between the arguments of the function, and those of recursive applications. The system automatically extracts conditions that this relation must satisfy, attempts to guess a well-founded relation based on these conditions, and then uses this relation to solve the termination goal.
Function definitions involving higher-order functions (e.g. https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figit_HTML.gif ) sometimes causes the system to derive unprovable termination conditions, if it cannot extract enough information about recursive applications. When this occurs, the user must provide a congruence theorem that specifies the context of the higher-order function. The system uses this theorem to derive correct termination conditions, by rewriting recursive applications.

5.2 Termination of Recursive Monadic Functions

By default, the HOL4 system is unable to automatically prove termination of recursive monadic functions involving https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figiu_HTML.gif . To aid the system in extracting provable termination conditions, we introduce the following congruence theorem for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figiv_HTML.gif :
https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Equ3_HTML.png
(3)
Theorem (3) expresses a rewrite of the term https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figiw_HTML.gif in terms of rewrites involving its component subterms ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figix_HTML.gif , https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figiy_HTML.gif , and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figiz_HTML.gif ), but allows for the assumption that https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figja_HTML.gif (the rewritten effect) must execute successfully.
However, rewriting definitions with (3) is not always sufficient: in addition to ensuring that the effect https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjb_HTML.gif in https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjc_HTML.gif executed successfully, the HOL4 system must also know the value and state resulting from its execution. This problem arises because the monadic state argument to https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjd_HTML.gif is left implicit in user definitions. We address this issue by rewriting the defining equations of monadic functions using \(\eta \)-expansion before passing them to the definition system, making all partial https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figje_HTML.gif applications syntactically fully applied. The whole process is automated so that it is opaque to the user, allowing definition of recursive monadic functions with no additional effort.

5.3 Synthesising ML from Recursive Monadic Functions

The proof-producing synthesis method described in Sect. 3.2 is syntax-directed and proceeds in a bottom-up manner. For recursive functions, a tweak to this strategy is required, as bottom-up traversal would require any recursive calls to be treated before the calling function (this is clearly cyclic).
We begin with a brief explanation of how our previous (pure) synthesis tool [14] tackles recursive functions, before outlining how our new approach builds on this.
Pure recursive functions. As an example, consider the function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjf_HTML.gif that computes the greatest common divisor of two positive integers:
Before traversing the function body of https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjh_HTML.gif in a bottom-up manner, we simply assume the desired https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figji_HTML.gif result to hold for all recursive applications in the function definition, and record their arguments during synthesis. This results in the following https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjj_HTML.gif theorem for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjk_HTML.gif (where https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjl_HTML.gif is defined as https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjm_HTML.gif ), and is used to record arguments for recursive applications):
https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Equ4_HTML.png
(4)
and below is the desired https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjn_HTML.gif result for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjo_HTML.gif :
https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Equ5_HTML.png
(5)
Theorems (4) and (5) match the shape of the hypothesis and conclusion (respectively) of the induction theorem for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjp_HTML.gif :
By instantiating this induction theorem appropriately, the preconditions in (4) can be discharged (and if automatic proof fails, the goal is left for the user to prove).
Monadic recursive functions. Function definitions whose termination depends on the monad give rise to induction theorems which also depend on the monad. This creates issues, as the monad argument is left implicit in the definition. As an example, here is a function https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjr_HTML.gif that searches through an array for a value:
When given the above definition, the HOL4 system automatically derives the following induction theorem:
https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Equ6_HTML.png
(6)
The context of recursive applications ( https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjt_HTML.gif and https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figju_HTML.gif ) has been extracted correctly by HOL4, using the congruence theorem (3) and automated \(\eta \)-expansion for https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjv_HTML.gif (see Sect. 5.2).
However, there is now a mismatch between the desired form of the https://static-content.springer.com/image/art%3A10.1007%2Fs10817-020-09559-8/MediaObjects/10817_2020_9559_Figjw_HTML.gif result and the conclusion of the induction theorem: the latter depends explictly on the state, but the function depends on it only implicitly. We have modified our synthesis tool to account for this, in order to correctly discharge the necessary preconditions as above. When preconditions cannot be automatically discharged, they are left as proof obligations to the user, and the partial results derived are saved in the HOL4 theorem database.

6 Case Studies and Experiments

In this section, we present the runtime and proof size results of applying our method to some case studies.
Register Allocation. The CakeML compiler’s register allocator is written with a state (and exception) monad but it was previously synthesized to pure CakeML code. We updated it to use the new synthesis tool, resulting in the automatic generation of stateful CakeML code. The allocator benefits significantly from this change because it can now make use of CakeML arrays via the synthesis tool. It was previously confined to using tree-like functional arrays for its internal state, leading to logarithmic access overheads. This is not a specific issue for the CakeML compiler; a verified register allocator for CompCert [3] also reported log-factor overheads due to (functional) array accesses.
Tests were carried out using versions of the bootstrapped CakeML compiler. We ran each test 50 times on the same input program, recording time elapsed in each compiler phase. For each test, we also compared the resulting executables 10 times, to confirm that both compilers generated code of comparable quality (i.e. runtime performance). Performance experiments were carried out on an Intel i7-2600 running at 3.4GHz with 16 GB of RAM. The results are summarized in Table 1. Full data is available at https://​cakeml.​org/​ijcar18.​zip.1
Table 1
Compilation and run times (in s) for various CakeML benchmarks
Timing
Benchmark
Knuth–Bendix
Smith-normal-form
Tailfib
Pidigits
Life
Logic
Compile (old)
18.15
16.34
8.86
9.16
9.51
12.31
Run (old)
19.58
23.53
16.60
15.47
25.59
23.33
Compile (new)
1.21
1.46
0.99
1.02
1.05
1.62
Run (new)
19.90
22.91
16.70
15.64
24.17
22.33
These compare a version of the CakeML compiler where the register allocator is purely functional (old) against a version which uses local state and arrays (new)
In the largest program (KnuthBendix), the new register allocator ran 15 times faster (with a wide 95% CI of 11.76–20.93 due in turn to a high standard deviation on the runtimes for the old code). In the smaller pidigits benchmark, the new register allocator ran 9.01 times faster (95% CI of 9.01–9.02). Across 6 example input programs, we saw ratios of runtimes between 7.58 and 15.06. Register allocation was previously such a significant part of the compiler runtime that this improvement results in runtime improvements for the whole compiler (on these benchmark programs) of factors between 2 and 9 times.
Speeding up the CakeML compiler. The register allocator exemplifies one way the synthesis tool can be used to improve existing, verified CakeML programs and in particular, the CakeML compiler itself. Briefly, the steps are: (i) re-implement slow parts of the compiler with, e.g., an appropriate state monad, (ii) verify that this new implementation produces the same result as the existing (verified) implementation, (iii) swap in the new implementation, which synthesizes to stateful code, during the bootstrap of the CakeML compiler. (iv) The preceeding steps can be repeated as desired, relying on the automated synthesis tool for quick iteration.
As another example, we used the synthesis tool to improve the assembly phase of the compiler. A major part of time spent in this phase is running the instruction encoder, which performs several word arithmetic operations when it computes the byte-level representation of each instruction. However, duplicate instructions appear very frequently, so we implemented a cache of the byte-level representations backed by a hash table represented as a state monad (i). This caching implementation is then verified (ii), before a verified implementation is synthesized where the hash table is implemented as an array (iii). We also iterated through several candidate hash functions (iv). Overall, this change took about 1-person week to implement, verify, and integrate in the CakeML compiler. We benchmarked the cross-compile bootstrap times of the CakeML compiler after this change to measure its impact across different CakeML compilation targets. Results are summarized in Table 2. Across compilation targets, the assembly phase is between 1.25 to 1.64 times faster.
Table 2
CakeML compiler cross-compile bootstrap time (in s) spent in the assembly phase for its various compilation targets
Timing
Cross-compilation target
ARMv6
ARMv8 (\(\dagger \))
MIPS
RISC-V
x64
Assembly (old)
8.86
8.69
9.21
8.27
Assembly (new)
6.43
6.94
6.7
5.04
\(\dagger \) For the ARMv8 target, the cross-compile bootstrap does not run to completion at the point of writing. This is for reasons unrelated to the changes in this paper
OpenTheory Article Checker. The type changing feature from Sect. 3.5 enabled us to produce an OpenTheory [8] article checker with our new synthesis approach, and reduce the amount of manual proof required in a previous version. The checker reads articles from the file system, and performs each logical inference in the OpenTheory framework using the verified Candle kernel [9]. Previously, the I/O code for the checker was implemented in stateful CakeML, and verified manually using characteristic formulae. By replacing the manually verified I/O wrapper by monadic code we removed 400 lines of tedious manual proof.
Effectful code using monads. Our work on encapsulating stateful computations (Sect. 4) in pure programs is similar in purpose to that of the ST monad [12]. The main difference is how this encapsulation is performed: the ST monad relies on parametric polymorphism to prevent references from escaping their scope, whereas we utilise lexical scoping in synthesised code to achieve a similar effect.
Imperative HOL by Bulwahn et al. [4] is a framework for implementing and reasoning about effectful programs in Isabelle/HOL. Monadic functions are used to describe stateful computations which act on the heap, in a similar way as Sect. 3 but with some important differences. Instead of using a state monad, the authors introduce a polymorphic heap monad—similar in spirit to the ST monad, but without encapsulation—where polymorphism is achieved by mapping HOL types to the natural numbers. Contrary to our approach, this allows for heap elements (e.g. references) to be declared on-the-fly and used as first-class values. The drawback, however, is that only countable types can be stored on the heap; in particular, the heap monad does not admit function-typed values, which our work supports.
More recently, Lammich [11] has built a framework for the refinement of pure data structures into imperative counterparts, in Imperative HOL. The refinement process is automated, and refinements are verified using a program logic based on separation logic, which comes with proof-tools to aid the user in verification.
Both developments [4, 11] differ from ours in that they lack a verified mechanism for extracting executable code from shallow embeddings. Although stateful computations are implemented and verified within the confines of higher-order logic, Imperative HOL relies on the unverified code-generation mechanisms of Isabelle/HOL. Moreover, neither work presents a way to deal with I/O effects.
Verified Compilation. Mechanisms for synthesising programs from shallow embeddings defined in the logics of interactive theorem provers exist as components of several verified compiler projects [1, 7, 13, 14]. Although the main contribution of our work is proof-producing synthesis, comparisons are relevant as our synthesis tool plays an important part in the CakeML compiler [10]. To the best of our knowledge, ours is the first work combining effectful computations with proof-producing synthesis and fully verified compilation.
CertiCoq by Anand et al. [1] strives to be a fully verified optimising compiler for functional programs implemented in Coq. The compiler front-end supports the full syntax of the dependently typed logic Gallina, which is reified into a deep embedding and compiled to Cminor through a series of verified compilation steps [1]. Contrary to the approach we have taken [14] (see Sect. 3.2), this reification is neither verified nor proof-producing, and the resulting embedding has no formal semantics (although there are attempts to resolve this issue [2]). Moreover, as of yet, no support exists for expressing effectful computations (such as in Sect. 3.4) in the logic. Instead, effects are deferred to wrapper code from which the compiled functions can be called, and this wrapper code must be manually verified.
The Œuf compiler by Mullen et al. [13] is similar in spirit to CertiCoq in that it compiles pure Coq functions to Cminor through a verified process. Similarly, compiled functions are pure, and effects must be performed by wrapper code. Unlike CertiCoq, Œuf supports only a limited subset of Gallina, from which it synthesises deeply embedded functions in the Œuf-language. The Œuf language has both denotational and operational semantics, and the resulting syntax is automatically proven equivalent with the corresponding logical functions through a process of computational denotation [13].
Hupel and Nipkow [7] have developed a compiler from Isabelle/HOL to CakeML AST. The compiler satisfies a partial correctness guarantee: if the generated CakeML code terminates, then the result of execution is guaranteed to relate to an equality in HOL. Our approach proves termination of the code.

8 Conclusion

This paper describes a technique that makes it possible to synthesise whole programs from monadic functions in HOL, with automatic proofs relating the generated effectful code to the original functions. Using the separation logic from characteristic formulae for CakeML, the synthesis mechanism supports references, exceptions, I/O, reusable library developments, encapsulation of locally stateful computations inside pure functions, and code generation for functions where termination depends on state. To our knowledge, this is the first proof-producing synthesis technique with the aforementioned features.
We hope that the techniques developed in this paper will allow users of the CakeML tools to develop verified code using only shallow embeddings. We hope that only expert users, who develop libraries, will need to delve into manual reasoning in CF or direct reasoning about deeply embedded CakeML programs.

Acknowledgements

Open access funding provided by Chalmers University of Technology. The first and fifth authors were partly supported by the Swedish Foundation for Strategic Research. The seventh author was supported by an A*STAR National Science Scholarship (Ph.D.), Singapore. The third author was supported by the UK Research Institute in Verified Trustworthy Software Systems (VeTSS).
Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Fußnoten
1
These tests were performed for the earlier conference version of this paper [6] comparing two earlier versions of the CakeML compiler. The compiler has changed significantly since then but we have we kept these experiments because they provide a fairer comparison of register allocation performance with/without using the synthesis tool to generate stateful code.
 
Literatur
1.
Zurück zum Zitat Anand, A., Appel, A., Morrisett, G., Paraskevopoulou, Z., Pollack, R., Belanger, O.S., Sozeau, M., Weaver, M.: CertiCoq: a verified compiler for Coq. In: CoqPL (2017) Anand, A., Appel, A., Morrisett, G., Paraskevopoulou, Z., Pollack, R., Belanger, O.S., Sozeau, M., Weaver, M.: CertiCoq: a verified compiler for Coq. In: CoqPL (2017)
2.
Zurück zum Zitat Anand, A., Boulier, S., Tabareau, N., Sozeau, M.: Typed template Coq—certified meta-programming in Coq. In: CoqPL (2018) Anand, A., Boulier, S., Tabareau, N., Sozeau, M.: Typed template Coq—certified meta-programming in Coq. In: CoqPL (2018)
3.
Zurück zum Zitat Blazy, S., Robillard, B., Appel, A.W.: Formal verification of coalescing graph-coloring register allocation. In: ESOP, Volume 6012 of LNCS (2010) Blazy, S., Robillard, B., Appel, A.W.: Formal verification of coalescing graph-coloring register allocation. In: ESOP, Volume 6012 of LNCS (2010)
4.
Zurück zum Zitat Bulwahn, L., Krauss, A., Haftmann, F., Erkök, L., Matthews, J.: Imperative functional programming with Isabelle/HOL. In: Mohamed, O.A., Muñoz, C.A., Tahar, S. (eds.) TPHOLs, Volume 5170 of LNCS, pp. 134–149 (2008) Bulwahn, L., Krauss, A., Haftmann, F., Erkök, L., Matthews, J.: Imperative functional programming with Isabelle/HOL. In: Mohamed, O.A., Muñoz, C.A., Tahar, S. (eds.) TPHOLs, Volume 5170 of LNCS, pp. 134–149 (2008)
5.
Zurück zum Zitat Guéneau, A., Myreen, M.O., Kumar, R., Norrish, M.L.: Verified characteristic formulae for CakeML. In: Yang, H. (ed.) ESOP, Volume 10201 of LNCS, pp. 584–610 (2017) Guéneau, A., Myreen, M.O., Kumar, R., Norrish, M.L.: Verified characteristic formulae for CakeML. In: Yang, H. (ed.) ESOP, Volume 10201 of LNCS, pp. 584–610 (2017)
6.
Zurück zum Zitat Ho, S., Abrahamsson, O., Kumar, R., Myreen, M.O., Tan, Y.K., Norrish, M.: Proof-producing synthesis of CakeML with I/O and local state from monadic HOL functions. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) IJCAR, pp. 646–662 (2018) Ho, S., Abrahamsson, O., Kumar, R., Myreen, M.O., Tan, Y.K., Norrish, M.: Proof-producing synthesis of CakeML with I/O and local state from monadic HOL functions. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) IJCAR, pp. 646–662 (2018)
7.
Zurück zum Zitat Hupel, L., Nipkow, T.: A verified compiler from Isabelle/HOL to CakeML. In: Ahmed, A. (ed.) European Symposium on Programming (ESOP). Springer, Berlin (2018) Hupel, L., Nipkow, T.: A verified compiler from Isabelle/HOL to CakeML. In: Ahmed, A. (ed.) European Symposium on Programming (ESOP). Springer, Berlin (2018)
8.
Zurück zum Zitat Hurd, J.: The OpenTheory standard theory library. In: Bobaru, M.G., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NFM, Volume 6617 of LNCS, pp. 177–191 (2011) Hurd, J.: The OpenTheory standard theory library. In: Bobaru, M.G., Havelund, K., Holzmann, G.J., Joshi, R. (eds.) NFM, Volume 6617 of LNCS, pp. 177–191 (2011)
9.
Zurück zum Zitat Kumar, R., Arthan, R., Myreen, M.O., Owens, S.: Self-formalisation of higher-order logic-semantics, soundness, and a verified implementation. J. Autom. Reason. 56(3), 221–259 (2016)MathSciNetCrossRef Kumar, R., Arthan, R., Myreen, M.O., Owens, S.: Self-formalisation of higher-order logic-semantics, soundness, and a verified implementation. J. Autom. Reason. 56(3), 221–259 (2016)MathSciNetCrossRef
10.
Zurück zum Zitat Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: a verified implementation of ML. In: Suresh, J., Sewell, P. (eds.) POPL, pp. 179–192 (2014) Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: a verified implementation of ML. In: Suresh, J., Sewell, P. (eds.) POPL, pp. 179–192 (2014)
11.
Zurück zum Zitat Lammich, P.: Refinement to Imperative/HOL. In: ITP, Volume 9236 of LNCS (2015) Lammich, P.: Refinement to Imperative/HOL. In: ITP, Volume 9236 of LNCS (2015)
12.
Zurück zum Zitat Launchbury, J., Peyton Jones, S.L.: Lazy functional state threads. In: Sarkar, V., Ryder, B.G., Soffa, M.L. (eds.) PLDI, pp. 24–35 (1994) Launchbury, J., Peyton Jones, S.L.: Lazy functional state threads. In: Sarkar, V., Ryder, B.G., Soffa, M.L. (eds.) PLDI, pp. 24–35 (1994)
13.
Zurück zum Zitat Mullen, E., Pernsteiner, S., Wilcox, J.R., Tatlock, Z., Grossman, D.: Œuf: minimizing the Coq extraction TCB. In: CPP (2018) Mullen, E., Pernsteiner, S., Wilcox, J.R., Tatlock, Z., Grossman, D.: Œuf: minimizing the Coq extraction TCB. In: CPP (2018)
14.
Zurück zum Zitat Myreen, M.O., Owens, S.: Proof-producing translation of higher-order logic into pure and stateful ML. J. Funct. Program. 24(2–3), 284–315 (2014)MathSciNetCrossRef Myreen, M.O., Owens, S.: Proof-producing translation of higher-order logic into pure and stateful ML. J. Funct. Program. 24(2–3), 284–315 (2014)MathSciNetCrossRef
15.
Zurück zum Zitat Owens, S., Myreen, M.O., Kumar, R., Tan, Y.K.: Functional big-step semantics. In: Thiemann, P. (ed.) ESOP, Volume 9632 of LNCS, pp. 589–615 (2016) Owens, S., Myreen, M.O., Kumar, R., Tan, Y.K.: Functional big-step semantics. In: Thiemann, P. (ed.) ESOP, Volume 9632 of LNCS, pp. 589–615 (2016)
16.
Zurück zum Zitat Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In: LICS, pp. 55–74 (2002) Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In: LICS, pp. 55–74 (2002)
Metadaten
Titel
Proof-Producing Synthesis of CakeML from Monadic HOL Functions
verfasst von
Oskar Abrahamsson
Son Ho
Hrutvik Kanabar
Ramana Kumar
Magnus O. Myreen
Michael Norrish
Yong Kiam Tan
Publikationsdatum
06.06.2020
Verlag
Springer Netherlands
Erschienen in
Journal of Automated Reasoning / Ausgabe 7/2020
Print ISSN: 0168-7433
Elektronische ISSN: 1573-0670
DOI
https://doi.org/10.1007/s10817-020-09559-8

Weitere Artikel der Ausgabe 7/2020

Journal of Automated Reasoning 7/2020 Zur Ausgabe