Das Kapitel vertieft sich in die formale Verifizierung von Programmeigenschaften und konzentriert sich auf den algebraischen Ansatz der Kleene Algebra with Tests (KAT) und ihre Erweiterungen. Sie unterstreicht die Bedeutung von relationalen Eigenschaften in probabilistischen Programmen, die in Bereichen wie Kryptographie, Netzwerkprogrammierung und differenzieller Privatsphäre von entscheidender Bedeutung sind. Der Text stellt BiGKAT vor, ein algebraisches Rahmenwerk, das dazu entwickelt wurde, über die relationalen Eigenschaften von probabilistischen Programmen nachzudenken. Es baut auf der Guarded Kleene Algebra with Tests (GKAT) auf, die ein probabilistisches Modell zur Interpretation probabilistischer Programme bietet. Das Kapitel skizziert die Syntax und Semantik von BiGKAT und zeigt, wie es eine Teilmenge der probabilistischen relationalen Hoare-Logik (pRHL) kodieren kann. Anhand detaillierter Beispiele und Beweise veranschaulicht es die Aussagekraft und Solidität von BiGKAT und bietet eine vielversprechende Richtung für die Automatisierung und Verständlichkeit von Verifikationsmethoden in der probabilistischen Programmierung. In der Diskussion wird BiGKAT mit pRHL verglichen, wobei die erhöhte Ausdruckskraft und das Potenzial für zukünftige Erweiterungen betont werden.
KI-Generiert
Diese Zusammenfassung des Fachinhalts wurde mit Hilfe von KI generiert.
Abstract
This work is devoted to formal reasoning on relational properties of probabilistic imperative programs. Relational properties are properties which relate the execution of two programs (possibly the same one) on two initial memories. We aim at extending the algebraic approach of Kleene Algebras with Tests (KAT) to relational properties of probabilistic programs. For that we consider the approach of Guarded Kleene Algebras with Tests (GKAT), which can be used for representing probabilistic programs, and define a relational version of it, called Bi-guarded Kleene Algebras with Tests (BiGKAT) together with a semantics. We show that the setting of BiGKAT is expressive enough to encode a finitary version of probabilistic Relational Hoare Logic (pRHL) (without the While rule), a program logic that has been introduced in the literature for the verification of relational properties of probabilistic programs. We also discuss the additional expressivity brought by BiGKAT.
1 Introduction
Formal verification of program properties has triggered a variety of methods, among which the algebraic approach of Kleene Algebra with Tests (KAT) stands out as an elegant, simple and automatizable framework [16, 18]. It is closely related to modeling with finite automata and has stimulated the development of techniques from coalgebra for reasoning about program behavior, for instance based on bisimulation checking [12]. It has also been implemented in a library for the Coq proof-assistant [19]. Among the properties one might want to check on programs, some important ones are expressed by relating the execution of two programs on two initial states, or of the same program on two initial states. They are called relational properties or 2-properties. One can think for instance of simulation properties, refinements, or extensional equivalence. Another example is non-interference: assume the variables are divided into public ones and private ones, a program satisfies non-interference if the final value of public variables after an execution only depends on the initial value of public variables.
Actually in a large number of situations the software systems one wants to verify are not deterministic but admit a probabilistic behaviour. Think for instance of randomized algorithms, cryptography, network programming or differential privacy. In those scenarios many crucial properties are also relational ones. For instance in cryptography one can express the fact that a randomized encryption scheme is safe as a probabilistic non-interference property: a public variable is assigned a ciphered value, obtained from a private variable, and we want to ensure that one cannot distinguish between two ciphered values computed from the same private initial state. Similarly in differential privacy (see e.g. [5]), in order to protect private data one might want to verify that two executions of a given program on two databases that differ only by one individual give indistinguishable result.
Anzeige
In order to express and prove relational properties on imperative programs some specific methods have been introduced. First in the deterministic case let us mention Relational Hoare Logic [10], that extends the classic Floyd-Hoare logic approach to reason on pairs of programs. This approach has been upgraded to the setting of probabilistic relational Hoare Logic (pRHL) by Barthe and coauthors [6]. It has then been extensively applied to the verification of cryptographic schemes, in particular through the development of the Certicrypt [9] and Easycrypt [3] tools.
However one would still benefit from additional techniques for the automation and the understandability of such reasoning methods. In particular one difficulty with (probabilistic) relational Hoare Logic is to find a suitable alignment of the two programs in order to be able, in a second step, to find the intermediate properties needed for the proof (see [1]). Algebraic methods coming from Kleene algebra with tests are promising in these respects. In particular they facilitate the reasoning on simple program transformations. Such direction is already addressed with the introduction of BiKAT [1], allowing to apply the KAT approach to reasoning on pairs of programs.
Our goal is thus to introduce a KAT approach to reason on relational properties of probabilistic programs. Unfortunately standard KAT techniques cannot be applied directly to probabilistic programs, since there is no known probabilistic interpretation for KAT. To handle this question, recent progress was made by the introduction of Guarded Kleene Algebra with tests (GKAT) [22], in which non-deterministic union and iteration are replaced by guarded union and iteration, while being sufficiently expressive to model imperative programming languages. The main motivation for the introduction of GKAT was initially to design a more efficient version of KAT where the complexity of the decision procedure is reduced, but it was also shown that GKAT admits a probabilistic model that can be used to interpret probabilistic programs.
Our strategy is thus to adapt the relational extension BiKAT to the setting of GKAT, in order to apply this relational approach to pairs of probabilistic programs. In practice we will consider programs of an imperative language extended with some probabilistic primitives. We call the corresponding calculus BiGKAT and in this paper define for it a syntax, a denotational semantics and a theory, which we prove to be sound. We would like to demonstrate the expressivity of our framework by showing how probabilistic relational Hoare Logic reasoning can be encoded in it, in an analogous way as (standard) Hoare logic can be encoded in KAT [17] and Relational Hoare Logic in BiKAT [1] (see 1). In the present work we make a first step in this direction, by showing that a finitary version of pRHL (without the While rule) can be encoded.
Table 1.
Program logics and algebras
Dummy
Anzeige
Outline. We first recall the definition of Guarded Kleene algebra with tests GKAT (Sect. 2), then introduce BiGKAT (Sect. 3) and describe its syntax, semantics, theory, as well as en encoding of a subset of pRHL. We present en example in Sect. 4 before discussing in Sect. 5 the differences between BiGKAT and pRHL and more generally related work in Sect. 6. We finish in Sect. 7 with conclusions and perspectives of future work.
Because of space constraints some proofs are omitted in this paper but can be found in the Appendix of the technical report [15].
2 Guarded Kleene algebra with tests
This section recalls the language and the semantics of Guarded Kleene Algebra with Tests (GKAT) [22], an abstraction of imperative programs where conditionals (\(c_1+_b c_2\)) and loops (\(c^{(b)}\)) are expressions guarded by Boolean predicates b. As explained before, the structure is a restriction of KAT in which we are not allowed to freely use operators \(+\) and \(^*\) to build expressions, i.e. GKAT does not allow nondeterminism. Although less expressive than KAT, GKAT offers two advantages: decidability in (almost) linear time (compared to PSPACE complexity of decidability in KAT), and better foundation for probabilistic applications. Although the first one was the main motivation to introduce the structure [22], we are more interested in the second advantage for the purpose of this paper.
2.1 Syntax
The syntax of GKAT is defined with a set of actions\(\varSigma \) and a finite set of primitive tests \(\texttt {T}\), which are disjoint. We denote actions by a and primitive tests by p. The sets of Boolean expressions BExp (also called tests) and GKAT expressions Exp (also called programs) are then defined by the following grammars:
where, for any \(b,b_1,b_2\in \texttt {BExp}\), operators \(\cdot \), \(+\) and
denote conjunction, disjunction and negation, respectively, and, for any \(c,c_1,c_2\in \texttt {Exp}\), the operator \(\cdot \) denotes sequential composition. The notations on the r.h.s. are given to help intuition and will sometimes be used when writing programs. Command skip will be a shorthand for assert 1, which is encoded by the Boolean expression 1.
The precedence of the operators is the usual one, i.e. the operator \(\cdot \) has higher precedence than operator \(+_b\), and \(()^{(b)}\) has higher precedence than \(\cdot \)1 To simplify the writing, we often omit the operator \(\cdot \) by writing \(c_1c_2\) for the expression \(c_1\cdot c_2\), for any \(c_1,c_2\in \texttt {Exp}\).
We are interested in using GKAT for representing probabilistic programs. For that, let us first fix a few definitions. Given a set S, \(\mathcal {D}(S)\) is the set of probability sub-distributions2 over S with discrete support, i.e. the set of functions \(\mu :S\rightarrow [0,1]\) such that \(Supp(\mu )=\{x\in S \,| \, \mu (x)>0 \}\) is discrete and \(\mu \) sums up to at most 1, i.e. \(\sum \limits _{s\in S}\mu (s)\le 1\). In particular, the Dirac distribution \(\delta _s\in \mathcal {D}(S)\) is the following map: \(w\rightarrow {\left\{ \begin{array}{ll} 1\text{, } \text{ if } w=s\\ 0\text{, } \text{ otherwise. } \end{array}\right. }\)
Example 1 (Imperative programming language)
Take a set Var of variables and a set Distr of sub-distributions over \(\mathbb {R}\) with discrete support. Consider a simple imperative programming language defined by the following grammar:
This language can be modeled in GKAT by taking as sets of actions and primitive tests respectively
and \(\texttt {T}=\{t_1<t_2,\, t_1=t_2 \, \mid t_1, t_2 \in \texttt {Terms}\}\)3 The first action evaluates term t and assigns the result to x; the second one samples from d and assigns the result to x.
Observe that while programs c may be probabilistic, due to the use of samplings, the tests b as for them are deterministic, i.e. they do not use any probabilistic primitives. In particular the conditional branching in programs is only done on deterministic tests.
2.2 Semantics
We now present the semantic interpretation of GKAT that we will be using, the Probabilistic model [22]4. We first review some basic concepts needed for the semantics. Given a statement \(\phi \) over a set S, the Iverson bracket\([\phi ]\) is the function on S taking value 1 on \(s\in S\) if \(\phi (s)\) is true and 0 if it is false. Typical models of probabilistic imperative programming languages interpret programs as Markov kernels on a set S, i.e. maps from S to probability distributions. The semantic model defined below interprets programs as sub-Markov kernels F, G...i.e. Markov kernels on sub-distributions.
Two particular examples of sub-Markov kernel on S are \(id_S\) and \(0_S\) respectively defined by, for any s, \(s'\in S\): \( id_S (s) =\delta _{s}\) and \(0_S (s)(s') = 0\).
The composition of two sub-Markov kernels F, G is \(F ;G\) defined by
\( (F ;G) (s)(s')= \sum \limits _{s'' \in S} F(s)(s'')\times G(s'')(s')\) . This composition is associative, admits \(id_S\) as identity and \(0_S\) as absorbing element: for any F, \(F ;id_S = id_S ;F = F\) and \(F ;0_S = 0_S ;F = 0_S\).
Definition 1 (Probabilistic interpretation)
Let \(i=(S,eval,sat)\) be a triple:
S is a set of states,
for each action \(a\in \varSigma \), \(eval(a): S\rightarrow \mathcal {D}(S)\) is a sub-Markov kernel,
for each primitive test \(p\in \texttt {T}\), \(sat(p)\subseteq S\) is a set of states.
We define \({sat}^\dag :\texttt {BExp}\rightarrow 2^{S}\) as the lifting of \({sat}:\texttt {T}\rightarrow 2^{S}\) to arbitrary Boolean expressions over \(\texttt {BExp}\). The probabilistic interpretation of an expression c with respect to i is the sub-Markov kernel \(\mathcal {P}_i\llbracket c\rrbracket :S\rightarrow \mathcal {D}(S)\) defined as follows:
Intuitively \(\mathcal {P}_i\llbracket c\rrbracket (s)(s')\) is the probability that the execution of c on initial state s terminates on state \(s'\), and \(\sum \limits _{s'\in S} \mathcal {P}_i\llbracket c\rrbracket (s)(s')\) is the probability that the execution of c on initial state s terminates (we then also say that it is a successful execution). Observe thus that we really need to consider sub-distributions and not only distributions. Note that the definition implies that \(\mathcal {P}_i\llbracket 1\rrbracket =id_{S}\) and \(\mathcal {P}_i\llbracket 0\rrbracket =0_{S}\). In the sequel, when the interpretation i is fixed we will sometimes write \(\llbracket c\rrbracket \) instead of \(\mathcal {P}_i\llbracket c\rrbracket \).
In the following we will consider programs over a finite set of variables \(\texttt {Var}\) and the set of states will be the set of memories, that we denote by m, i.e. functions in \(\texttt {Var} \rightarrow D\) where D is the domain of values (we can take for instance \(D=\mathbb {Q}\), the rational numbers). If \(x\in \texttt {Var}\) and m is a memory, then \(m[x\leftarrow t]\) is the memory identical to m except that it maps x to the evaluation of t in memory m. The interpretation of actions \(a\in \texttt {Exp}\) as sub-Markov kernels is then given by \(eval(x\leftarrow t)(m):=\delta _{m[x\leftarrow t]}\) and
.
Example 2
Let us consider the example of the uniform distribution over the two-elements boolean set \(Bool=\{tt, ff\}\) (or unbiased coin), that we call dbool: \(dbool(tt)=dbool(ff)=1/2\).
Then we have
.
2.3 Axioms
The theory of GKAT introduced in [22] is given by the axioms from Fig. 1. Note in particular for the fixpoint axiom (15). Intuitively, it says that if expression \(c_3\) chooses (using guard b) between executing \(c_1\) and looping again, and executing \(c_2\), then \(c_3\) is a b-guarded loop followed by \(c_2\). However, the rule is not sound in general (see [22] for more details). In order to overcome such limitation, the side condition \(E(c_1)=0\) is introduced, ensuring that command \(c_1\) is productive, i.e. that it performs some action. To this end, the function E is inductively defined as follows: \(E(b):=b\), \(E(a):=0\), \(E(c_1+_b c_2):=b\cdot E(c_1)+\lnot b\cdot E(c_2)\), \(E(c_1\cdot c_2):=E(c_1)\cdot E(c_2)\), \(E(c^{(b)}):=\lnot b\). We can see E(c) as the weakest test that guarantees that command c terminates successfully but does not perform any action.
Fig. 1.
Axiomatisation of Guarded Kleene algebra with tests
Moreover, note particularly the following observation: in KAT the encoding
\(c_1;(b;c_2+\lnot b;c_3)=c_1;b;c_2+c_1;\lnot b;c_3\) is not an if-then-else statement; it is rather a nondeterministic choice between executing \(c_1\), then testing b and executing \(c_2\), and executing \(c_1\), then testing \(\lnot b\) and executing \(c_3\). That is why left distributivity does not hold in GKAT for any \(c\in \text {Exp}\); it only holds for the particular case of \(b\in \text {BExp}\), i.e. if b is a test.
In [15] we list additional derivable equations in GKAT, also given in [22].
We already mentioned that GKAT does not allow to construct an arbitrary program by using freely the nondeterministic choice operator \(+\), allowing only guarded choice \(+_b\), for any \(b\in \text {BExp}\). However, the \(+\) operator is included in the grammar of BExp, representing the Boolean disjunction. Since BExp\(\subseteq \)Exp, the grammar allows to write expressions as \(b_1+_b b_2\), for any \(b\in \text {BExp}\).
By Boolean reasoning, we can observe that \(b\cdot b+\lnot b\cdot \lnot b=1\). Such property will be useful later to prove the soundness of R-Case rule (39).
3 Bi-guarded Kleene algebra with tests
We will define an algebra called BiGKAT, based on GKAT and which will allow us to reason on relations between two probabilistic programs, that we refer to as left and right programs. Just as GKAT it will be defined by a grammar of tests and a grammar of expressions. They can be thought of as tests over a product state space \(S \times S\) and probabilistic programs over the same product state space. We will give them a semantics of sub-Markov kernels over \(S \times S\).
3.1 Syntax
The syntax of BiGKAT is defined using that of GKAT and is additionnally parameterized by a set of actions \(\mathbb {\Sigma }\) and a finite set of primitive tests \(\mathbb {P}\) which are disjoint. Elements of \(\mathbb {\Sigma }\) are denoted as A and those of \(\mathbb {P}\) are denoted as P. A typical choice will be to consider in \(\mathbb {P}\) some tests relating variables of the two programs on the two state spaces such as for instance \([x=x']\), and to consider in \(\mathbb {\Sigma }\) some couplings between random assignments. The language of Bi-guarded Kleene algebra with tests (BiGKAT) consists of expressions in \(\ddot{\text {E}}\text {xp}\) constructed from GKAT expressions c, \(c'\) in Exp, as follows:
We will sometimes omit
and write \(C_1 C_2\) for
, and \(\overline{B}\) for \(\overline{\lnot }{B}\).
We define the notation \(\langle \_ \vert \_ \rangle \) as
.
Note that in the expression \(\underset{C}{\{c\mid c'\}}\), c and \(c'\) belong to Exp (so GKAT) while the bottom C belongs to \(\ddot{\text {E}}\text {xp}\) (BiGKAT). The intuition behind this expression is that C is a joint kernel over \(S\times S\), whose left and right projections are respectively c and \(c'\), and that \(\underset{C}{\{c\mid c'\}}\) denotes C itself. We will make this intuition more precise in the following section by defining the semantic interpretation by sub-Markov kernels.
3.2 Semantics of BiGKAT
As for interpreting GKAT we were using sub-Markov kernels over a given set S, now for BiGKAT for interpreting pairs of programs we will consider sub-Markov kernels over the product \(S^2\). We will denote sub-Markov kernels on S by lower-case letters c, \(c'\), \(c_1\) ...and those on \(S^2\) by upper-case letters C, \(C'\), \(C_1\), D .... In order to recover the interpretation of the left and right programs we will need the notion of projections or marginals. For that, given a sub-distribution \(\mu \) on \(S^2\) we denote its left and right marginals respectively as: \( \varPi _1(\mu )(s)=\sum \limits _{s'\in S} \mu (s,s'), \quad \varPi _2(\mu )(s')=\sum \limits _{s \in S} \mu (s,s').\) Moreover if E is a subset of \(S^2\), denote \( \varPi _1(E)=\{s /\exists s'\in S, (s,s')\in E\}\) and \( \varPi _2(E)=\{s' /\exists s\in S, (s,s')\in E\}\).
Example 3
Let us consider the set \(Bool=\{tt,ff\}\) and the distributions on the product \(Bool^2\) defined on Fig. 2 (the subscripts p, s, a are respectively for product, symmetric and antisymmetric). The array notation here means that the value of \(\mu _p(x,x')\) is given by the coefficient on the line (resp. column) given by x (resp. \(x'\)).
Fig. 2.
Three distributions on \(Bool^2\)
By computing the left and right marginals, one obtains the following equalities, where dbool is the uniform distribution on Bool (see Example 2): \(\varPi _1(\mu _p)= \varPi _1(\mu _s)=\varPi _1(\mu _a)= dbool, \qquad \varPi _2(\mu _p)= \varPi _2(\mu _s)=\varPi _2(\mu _a)= dbool\).
A simple way of constructing sub-Markov kernels on \(S^2\) is by using products. The product of two sub-Markov kernels c and \(c'\) on S is defined as \(c\times c':(s_1,s_1')\rightarrow \big ((s_2,s_2')\rightarrow c(s_1)(s_2)\times c'(s_1')(s_2')\big )\).
Let us show an example of sub-Markov kernel on \(S^2\) in the case where S is a state of memories, as in Sect. 1. Assume \(\mu \) is a distribution over a product space \(D^2\) where D is a domain for a variable x (for instance \(D=Bool\), as in Example 3). We define the sub-Markov kernel \(C_{\mu }\) in a way similar to
in Sect. 1 but over \(S^2\): \( C_{\mu }(m,m')=\sum \limits _{(t,t') \in Supp(\mu )}\mu (t,t')\cdot \delta _{(m[x\leftarrow t],m'[x'\leftarrow t'])}.\) In other words: \( C_{\mu }(m,m')(m_1,m'_1)=0\) if there exists \(y\ne x\) such that \(m_1(y)\ne m(y)\) or \(y'\ne x'\) such that \(m'_1(y')\ne m'(y')\), and otherwise \( C_{\mu }(m,m')(m_1,m'_1)=\mu (t,t')\) where \(t=m_1(x)\), \(t'=m'_1(x')\). So Example 3 for instance allows to define the sub-Markov kernels \(C_{\mu _p}\), \(C_{\mu _s}\), \(C_{\mu _a}\).
Let us now consider the marginals of the sub-distributions \(C_{\mu }(m,m')\):
In the case where \(\mu \) is anyone of the three distributions \(\mu _p\), \(\mu _s\), \(\mu _a\) of Example 3 one thus obtains, following the definition in Example 2:
Definition 2
We say that two sub-Markov kernels \(C_1,C_2\) on \(S^2\) are equivalent, denoted by \(C_1\equiv C_2\), if \(\forall _{(s,s')\in S^2}, i=1,2, \varPi _i(C_1(s,s'))=\varPi _i(C_2(s,s'))\).
For instance we can deduce from Example 4 that: \( C_{\mu _p} \equiv C_{\mu _s} \equiv C_{\mu _a}.\)
Definition 3
We say that a sub-Markov kernel C on \(S^2\) is separable if there exists sub-Markov kernels \(\mathcal {L}^{C}(.)\), \(\mathcal {R}^{C}(.)\) on S such that, for all \((s,s')\): \(\varPi _1(C(s,s'))=\mathcal {L}^{C}(s)\) and \(\varPi _2(C(s,s'))=\mathcal {R}^{C}(s')\).
Example 4 shows that \(C_{\mu _p}\), \(C_{\mu _s}\) and \(C_{\mu _a}\) are separable. Now, given a probabilistic interpretation \(\mathcal {P}_i\llbracket .\rrbracket \) of GKAT we want to define a probabilistic interpretation \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) of BiGKAT.
Definition 4 (Probabilistic interpretation of BiGKAT)
A probabilistic interpretation of BiGKAT is defined by a probabilistic interpretation
\(i=(State,eval,sat)\) of GKAT and two functions \(\text {Eval}\) and \(\text {Sat}\) such that:
for each action \(A\in \mathbb {\Sigma }\), \(\text {Eval}(A): S^2\rightarrow \mathcal {D}(S^2)\) is a sub-Markov kernel,
for each primitive test \(P \in \mathbb {P}\), \(\text {Sat}(P)\subseteq S^2\) is a set of states.
We want to define the interpretation \(\overline{\mathcal {P}_i}\llbracket B\rrbracket \) and \(\overline{\mathcal {P}_i}\llbracket C\rrbracket \) of expressions in \(\ddot{\text {B}}\text {Exp}\) and \(\ddot{\text {E}}\text {xp}\). This interpretation will actually only be partially defined, that is to say in some cases \(\overline{\mathcal {P}_i}\llbracket C\rrbracket \) is undefined.
For B in \(\ddot{\text {B}}\text {Exp}\), \(\overline{\mathcal {P}_i}\llbracket B\rrbracket \) is defined in the same way as \(\mathcal {P}_i\llbracket b\rrbracket \) for b in BExp, using \(\text {Sat}\).
For \(A\in \mathbb {\Sigma }\), \(\overline{\mathcal {P}_i}\llbracket A\rrbracket \) is defined as \(\overline{\mathcal {P}_i}\llbracket A\rrbracket =\text {Eval}(A)\). The interpretations of constructs
, \(\overline{\mathcal {P}_i}\llbracket C_1 \oplus _{B} C_2\rrbracket \), \(\overline{\mathcal {P}_i}\llbracket C^{(B)}\rrbracket \) are defined in the same way as the interpretations of the similar constructs of GKAT expressions with \(\mathcal {P}_i\llbracket .\rrbracket \), except that the state is now \(S^2\) instead of S.
There thus only remains to define \(\overline{\mathcal {P}_i}\llbracket \langle c\vert \rrbracket \), \(\overline{\mathcal {P}_i}\llbracket \vert c\rangle \rrbracket \) and \(\overline{\mathcal {P}_i}\llbracket \underset{C}{\{c\mid c'\}}\rrbracket \).
\(\overline{\mathcal {P}_i}\llbracket \langle c\vert \rrbracket \) is defined as \(\overline{\mathcal {P}_i}\llbracket \langle c\vert \rrbracket =\mathcal {P}_i\llbracket c\rrbracket \times id_{S}\).
\(\overline{\mathcal {P}_i}\llbracket \vert c\rangle \rrbracket \) is defined as \(\overline{\mathcal {P}_i}\llbracket \vert c\rangle \rrbracket =id_{S} \times \mathcal {P}_i\llbracket c\rrbracket \).
\(\overline{\mathcal {P}_i}\llbracket \underset{C}{\{c\mid c'\}}\rrbracket \) is defined only if \(\overline{\mathcal {P}_i}\llbracket C\rrbracket \) is defined and if we have for all s, \(s'\in S\):
and in this case we define \(\overline{\mathcal {P}_i}\llbracket \underset{C}{\{c\mid c'\}}\rrbracket =\overline{\mathcal {P}_i}\llbracket C\rrbracket \).
As a consequence we have:
Lemma 1
For all c, \(c'\) in Exp we have \(\overline{\mathcal {P}_i}\llbracket \langle c \vert c' \rangle \rrbracket =\mathcal {P}_i\llbracket c\rrbracket \times \mathcal {P}_i\llbracket c'\rrbracket \).
Now, remember the interpretation of composition by pre- and postconditions in GKAT, \(\mathcal {P}_i\llbracket c\cdot b\rrbracket \) and \(\mathcal {P}_i\llbracket b\cdot c\rrbracket \), given in (1) and (2). It also holds in the same way for BiGKAT, as the interpretations are the same as sub-Markov kernels over \(S^2\). As a consequence we have, for the sub-distributions of Example 3:
Example 5
\(C_{\mu _s}= C_{\mu _s}\, \overline{\mathcal {P}_i}\llbracket [x=x']\rrbracket , \qquad C_{\mu _a}= C_{\mu _a}\, \overline{\mathcal {P}_i}\llbracket [x=\overline{x'}]\rrbracket \) (where \(\overline{x'}\) is the negation of x).
3.3 Theory of BiGKAT
Now we give a list of axioms on BiGKAT expressions, using predicates \(=\) and \(\equiv \), that will be interpreted by equality and equivalence on sub-Markov kernels.
The functions \(\langle \_\vert :\text {Exp}\rightarrow \ddot{\text {E}}\text {xp}\), \(\vert \_\rangle :\text {Exp}\rightarrow \ddot{\text {E}}\text {xp}\) satisfy, for any \(b_1\), \(b_2\), \(b\in B\), \(c_1\), \(c_2\), \(c\in C\), the following properties:
Similarly for \(\vert \_\rangle \). We say that \(\langle \_\vert \) and \(\vert \_\rangle \) are homomorphisms. The operators have the same precedence as in GKAT.
The following property on \(\langle .\vert \) and \(\vert .\rangle \):
(24)
The operators have the same precedence as in GKAT. For readability we use interchangeably the same notation for operators in GKAT and BiGKAT, i.e. operators \(~{\cdot }\), \(~{\lnot }\) and \(~{+_e}\), for any \(~{e\in B}\), and constants 1 and 0 in GKAT stand for
, \(~{\overline{\lnot }}\), \(~{\oplus _{\langle e\vert }}\) (\(~{\oplus _{\vert e\rangle }}\)), \(\ddot{\text {1}}\, \text {and} \ddot{\text {0}}\) , respectively. Often we go even further and omit the operator \(\cdot \) and we write \(\langle c_1\vert \langle c_2\vert \) (\(\vert c_1\rangle \vert c_2\rangle \)) for \(\langle c_1\vert \cdot \langle c_2\vert \) (\(\vert c_1\rangle \cdot \vert c_2\rangle \)).
Note that property (24) states a commutativity property between programs that run in parallel, but we do not have in general
for \(c_1\), \(c_2\) in \(\ddot{\text {E}}\text {xp}\) (and similarly for operator \(\vert \_\rangle \)).
The following properties on
:
(25)
(26)
GKAT axioms for\(=\) (see Fig. 1) on GKAT expressions,
GKAT axioms for \(=\) (see Fig. 1) written in the language of BiGKAT for BiGKAT expressions:
for instance equation (3) becomes \( C \oplus _{B} C= C\) and (8) becomes
.
Axioms for \(\equiv \):
(27)
(28)
(29)
We call this list of axioms the theory of BiGKAT. We say that a formula \(C_1=C_2\) (resp. \(C_1 \equiv C_2\)) is well-defined for an interpretation \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) if \(\overline{\mathcal {P}_i}\llbracket C_1\rrbracket \) and \(\overline{\mathcal {P}_i}\llbracket C_2\rrbracket \) are both defined. An implicative formula \(F_1 \Rightarrow F_2\) is well-defined for \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) if both \(F_1\) and \(F_2\) are well-defined.
Now we say that an interpretation \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) satisfies a formula F if F is well-defined for \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) and if moreover:
1.
if F is of the fom \(C_1=C_2\) (resp. \(C_1 \equiv C_2\)) then \(\overline{\mathcal {P}_i}\llbracket C_1\rrbracket = \overline{\mathcal {P}_i}\llbracket C_2\rrbracket \) (resp. \(\overline{\mathcal {P}_i}\llbracket C_1\rrbracket \equiv \overline{\mathcal {P}_i}\llbracket C_2\rrbracket \))
2.
if F is of the form \(C_1=C_2\Rightarrow C_3\equiv C_4\) and if \(\overline{\mathcal {P}_i}\llbracket C_1\rrbracket = \overline{\mathcal {P}_i}\llbracket C_2\rrbracket \), then \(\overline{\mathcal {P}_i}\llbracket C_3\rrbracket \equiv \overline{\mathcal {P}_i}\llbracket C_4\rrbracket \) (and similarly for other implicative formulas built with \(=\) and \(\equiv \)).
An example of formula which can in some cases not be well-defined is equation (25), because for \(\overline{\mathcal {P}_i}\llbracket \underset{C}{\{c\mid c'\}}\rrbracket \) to be defined its projections need to satisfy the equations (16).
Proposition 1
The interpretation \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) of BiGKAT expressions by sub-Markov kernels on the product space \(S^2\) defined in Sect. 3.2 satisfies all well-defined axioms of the theory of BiGKAT.
Proof
The properties (17) - (23) are obtained directly from the semantics of BiGKAT (Definition 4). The proofs of properties (24), (25), (27) - (29) are in [15].
For GKAT axioms (Fig. 1) formulated for BiGKAT, note that since the probabilistic interpretation of BiGKAT expressions \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) is the same as the one of GKAT (on product space \(S^2\)), we naturally have that such interpretation satisfies the axioms of Fig. 1 written in the language of BiGKAT. \(\square \)
We give in [15] some properties as consequences of the theory of BiGKAT.
We now derive proofs in BiGKAT in the following way. We assume given an interpretation \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) and a finite subset of elements A of \(\mathbb {\Sigma }\) together with some equations: \(A = \underset{A}{\{c\mid c'\}}\) and
which are well-defined and satisfied by \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \). Then we can use these equations and any formula of the theory of BiGKAT which is well-defined and satisfied by \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) to derive new formulas. Proposition 1 then ensures that any formula derived in this way is satisfied by \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \).
Note that additionnally, if we allow ourselves to use in the proof any formula \(C=C'\) which is satisfied by \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) (semantic hypothesis), we still preserve the fact that the formulas derived are satisfied by \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \).
Example 6
Consider again the set space of memories. Consider two elements \(A_s\), \(A_a\) of \(\mathbb {\Sigma }\) with equations:
(30)
(31)
Then if we choose \(\overline{\mathcal {P}_i}\llbracket A_s\rrbracket =C_{\mu _s}\) and \(\overline{\mathcal {P}_i}\llbracket A_a\rrbracket =C_{\mu _a}\), we know by Example 4 and Example 5 that \(\overline{\mathcal {P}_i}\llbracket .\rrbracket \) satisfies the formulas above. Let us give a small example of proof:
3.4 Encoding of a subsystem of pRHL into BiGKAT
We want to encode (a part of) probabilistic relational Hoare logic (pRHL) in BiGKAT. Recall that Hoare logic can be encoded in KAT [17] by encoding a Hoare triple \(\{\phi \} \; c \; \{\psi \}\) as a KAT equation \(\phi \cdot c =\phi \cdot c \cdot \psi \). The intuitive meaning of the equation is that testing \(\psi \) after executing \(\phi \cdot c\) is always redundant. This approach has been extended to the relational setting with RHL and BiKAT in [1]. To define such an encoding for pRHL and BiGKAT we need first to recall the definition and semantics of pRHL judgements and proofs [6]. We consider the language of probabilistic programs of Example 1, a state space S of memories and a probabilistic interpretation \(\mathcal {P}_i\llbracket .\rrbracket \).
Let us first define the range of a subdistribution \(\mu \) on \(S^2\): \(range(\mu )=\{(m,m')\in S \; /\; \mu (m,m')>0 \}\).
A pRHL judgement is a tuple of the form \(c\sim c':\phi \Rightarrow \psi \) where c, \(c'\) are programs and \(\phi \), \(\psi \) are predicates on \(S^2\) (relations on states). For simplicity we will also denote as \(\phi \) the subset of elements in \(S^2\) satisfying \(\phi \).
One says that the pRHL judgement is valid in the interpretation\(\mathcal {P}_i\llbracket .\rrbracket \), denoted as \(\models _i c\sim c':\phi \Rightarrow \psi \) if:
for any \((m,m') \in \phi \), there exists a subdistribution \(\mu \) on \(S^2\) such that: \(\varPi _1(\mu )=\mathcal {P}_i\llbracket c\rrbracket (m)\), \(\varPi _2(\mu )=\mathcal {P}_i\llbracket c'\rrbracket (m')\), and \(range(\mu ) \subseteq \psi \). One then says that programs c and \(c'\) are equivalent with respect to precondition \(\phi \) and postcondition \(\psi \). If the interpretation i is fixed we write \(\models \) instead of \(\models _i\).
Following the above interpretation, we encode the pRHL judgment in BiGKAT as follows:
(32)
where \(\phi , \psi \in \ddot{\text {B}}\text {Exp}\) and \(c,c'\in \text {Exp}\).
Note that we do not use the encoding
since in GKAT and BiGKAT there is no natural notion of order \(\le \) as in KAT [16, 18];
We do not use either the encoding
. In KAT, \(\phi \cdot c =\phi \cdot c \cdot \psi \) is equivalent to \(\phi \cdot c \cdot \lnot \psi =0 \), but this cannot be proved in the same way in GKAT and the equivalence might not hold. We only have the implication \((\phi \cdot c =\phi \cdot c \cdot \psi ) \Rightarrow (\phi \cdot c \cdot \lnot \psi =0 )\), and we choose as encoding the stronger property. This encoding aligns with the intuitive interpretation of the validity of a Hoare triple, i.e. from a state satisfying the pre-condition \(\phi \), each execution of \(c, c'\), if it halts, it leads to a state satisfying the post-condition \(\psi \).
Observe that the semantic interpretation of (32) is the same as \(\models _i c\sim c':\phi \Rightarrow \psi \).
We display on Fig. 3 the rules of pRHL defined in [6]5, except the rule for While, that we replace by an iteration rule (R-Iter rule) which we will explain below. This is a subsystem of pRHL, but we keep here the name pRHL for convenience.
Fig. 3.
Probabilistic Relational Hoare Logic rules (pRHL)
We use different notation for pre and post conditions (\(\phi \), \(\psi \)) and for guards (\(\langle b\vert \), \(\vert b'\rangle \)). Note in particular the side condition
in rule R-Cond, where the right-hand side
is equivalent to \(\langle b \vert b' \rangle +\langle \lnot b \vert \lnot b' \rangle \), so the following holds
$$\begin{aligned} \phi \langle b \vert \lnot b' \rangle =0\;\;\; \phi \langle \lnot b \vert b' \rangle =0 \end{aligned}$$
These equalities assure that the predicates b and \(b'\) are evaluated to the same value on both left and right programs [1]. In particular, for the R-Cond rule it means that the same branch is executed for right-hand side and left-hand side programs. Observe that similarly as for Hoare logic, some rules of pRHL, namely axiom rules R-Assign, R-Assign left and R-Rand assign (see [15]) do not depend on pRHL judgements as premises but rather on an interpretation of actions and predicates, and a semantic condition (for R-Rand assign). Thus we do not expect to derive their encoding as an equation valid in the theory of BiGKAT. Instead, when we deal with examples we will consider a particular interpretation and thus reason on equalities of expressions in the model. The first rule derives a valid Hoare triple with the substitution of variables \(x,x'\) by expressions v, \(v'\), respectively; the second one derives a valid triple with samplings over distributions \(d,d'\). The R-Iter rules means that if the execution of program c does not change \(\phi \), then the composition of cn times does not change \(\phi \). In R-Iter on Fig. 3, \(c^n=c\cdot c^{n-1}\), where \(c^1=c\).
Let us explain the R-Rand assign rule: \(h\vartriangleleft (d,d')\) means that h is a coupling between distributions d, \(d'\), that is to say a bijective function from Supp(d) to \(Supp(d')\) such that: for every \(v\in Supp(d)\), \(P_{x\sim d}[x=v]=P_{x\sim d'}[x=h(v)]\). For instance \(\mu _s\) and \(\mu _a\) in Ex. 6 respectively correspond to couplings \(h=id\) and negation on Bool.
Now, to show that the rules of Fig. 3 are sound in BiGKAT, we interpret them as in Fig. 46, by using the encoding of pRHL judgements as BiGKAT equations defined previously.
Fig. 4.
Encoding of pRHL rules in BiGKAT
Our goal is now to prove that the rules above are valid in BiGKAT. In this approach, showing that a pRHL rule is sound in BiGKAT will consist in proving that the conjunction of the BiGKAT equations encoding the premises of the pRHL rule implies the equation encoding the conclusion of the rule.
Finally we obtain the main result on the soundness of pRHL rules in BiGKAT.
Theorem 1 (Soundness of pRHL in BiGKAT)
The encoding of pRHL rules (Fig. 3) R-Seq, R-Cond, R-Sub, R-Case and R-Iter displayed in Fig. 4 can be derived by proofs in BiGKAT.
4 Example
In this section we use the framework presented before to reason about invariance features of probabilistic programs.
Consider also a second copy denoted as \(c'\). We prove the invariance of variables \(y, y'\), relational predicate \([y=y']\), over executions of \(c,c'\), which corresponds to the following pRHL judgment \(\vdash c\sim c':[y=y']\Rightarrow [y=y']\). In order to simplify the writing we denote
, \(d_2=b\leftarrow ff\) and \(c_2=(y\leftarrow y\, xor\, b)\), so that \(c=(d_1+_{[x=tt]}d_2)\cdot c_2\). We then use some equational reasoning to obtain in GKAT \((d_1+_{[x=tt]}d_2)\cdot c_2 =_{(7)} (d_1\cdot c_2)+_{[x=tt]}(d_2\cdot c_2)\).
In order to define C the BiGKAT expression showing the analog of the pRHL judgement above, we will distinguish 4 subcases depending on the evaluation of \(\langle [x=tt] \vert [x'=tt] \rangle \): \((1) x=tt, x'=tt\), \( (2) x\ne tt, x'=tt\), \((3) x=tt, x'\ne tt\) and \( (4) x\ne tt, x'\ne tt\).
For that we will define 4 expressions \(C_{i j}\) (\(i=0,1\), \(j=0,1\)) and C as:
and similarly for the other \(C_{ij}\), then we will be able to deduce that \([y=y'] C =[y=y'] C [y=y']\), by using repeatedly axiom (7) for BiGKAT.
We present here the proof of the property for \(C_{11}\) (subcase (1)), the most delicate one, and leave the proofs for the other \(C_{ij}\) to [15].
subcase (1):
Using assumptions \(x=tt\), \(x'=tt\) for the left and right programs we obtain \([x=tt]((d_1\cdot c_2)+_{[x=tt]}(d_2\cdot c_2))=[x=tt](d_1\cdot c_2)\) and \([x'=tt]((d'_1\cdot c'_2)+_{[x'=tt]}(d'_2\cdot c'_2))= [x'=tt](d'_1\cdot c'_2) \). As \(d_1\) and \(d'_1\) contain a sampling we choose to use a coupling in order to obtain a postcondition. We use the constant \(A_s\) defined in Example 6 with its interpretation
.
Denote \(e_1=(y\leftarrow y\, xor\, tt)+_{[b=tt]}1\), \(e'_1=(y'\leftarrow y'\, xor\, tt)+_{[b'=tt]}1\) and let
. By using R-Cond rule (36) one can obtain:
. We thus get
. Moreover we have by semantic hypothesis:
and
, so we obtain
. Now, recall that \(c_2=(y\leftarrow y\, xor\, b)\). We thus have
. So from the two last equalities we deduce:
. So finally:
. This was the property expected for \(C_{11}\).
5 Discussion: comparison between BiGKAT and pRHL
We have seen that BiGKAT is at least as expressive as pRHL without While, since rules of the latter can be encoded in the former. Here we want to illustrate that BiGKAT is in some aspects more expressive than pRHL.
First, BiGKAT allows to derive some rules on pRHL judgements that are valid in the probabilistic model but that cannot be derived within the system pRHL, as defined in [6]. Consider the candidate rule below:
It can be derived in BiGKAT, however it cannot be derived as a sequence of pRHL rules (Fig. 3), simply because if we read any pRHL rule bottom-up the programs in the premises are subterms of the programs in the conclusion. More generally we can derive in BiGKAT:
where premises \(c_1=c_2\) are given by any GKAT axioms. These rules extend in some sense pRHL with rewriting of programs according to GKAT axioms. This might be useful for some usages of pRHL (see for instance [2]).
A second point is that as BiGKAT, contrarily to pRHL, explicitly indicates for couplings the "witnesses" Markov kernels on \(S^2\), it allows to express rules that cannot be written in pRHL. For instance, the following rule, displayed in pRHL format, is (trivially) derivable in BiGKAT:
(40)
However the corresponding candidate pRHL rule is unsound:
As a counter-example consider Example 6 and take
,
, \(\phi =1\), \(\psi _1=[x=x']\), \(\psi _2=[x=\overline{x'}]\). Then \(\psi _1 \wedge \psi _2\) is false, hence the conclusion is not valid. But the BiGKAT rule (40) could not be applied because the two witnesses \(A_s\) and \(A_a\) are not the same C. On this point we plan to study the possible links between BiGKAT and [7].
Finally, a third point is that BiGKAT allows to express judgements with assertions which cannot be expressed by a single pRHL judgement. Consider for example an equality
. It says that, assuming precondition \(\phi \) for the pair of programs \(c_1c_2\) and \(c'_1c'_2\), if moreover after execution of \(\langle c_1 \vert c'_1 \rangle \) the assertion \([x=x']\) holds, then postcondition \(\psi \) is satisfied. This cannot be expressed by a single pRHL judgement.
6 Related work
A seminal approach to relational analysis of programs is due to [10], with the introduction of Relational Hoare logic (RHL). The system takes as the central ingredient the interpretation of program properties as relations over memories, allowing, for example, to prove correctness of program transformations for compiler optimisations. An algebraic approach to this system was taken in [1], by introducing BiKAT, a relational extension of KAT to model relational reasoning on programs, being able to express all-exists properties, i.e. ‘for any run of one program there exists a run of the other such that \(\ldots \)’. It was shown in this paper that both the syntax and the deductive system of RHL, including all-exists corresponding versions [11, 13], can be interpreted in BiKAT.
Probabilistic relational Hoare logic (pRHL), a probabilistic variant of RHL, was introduced by Barthe and coauthors in [6], motivated by the certification of cryptographic proofs. One goal of our approach is to subsume pRHL into algebraic reasoning, taking inspiration from BiKAT. Rather than using KAT, we build our approach on GKAT [22]. The main advantage of this structure for the purpose of this paper is the easier representation of probabilistic programming languages due to the absence of nondeterminism. The work of [22] also introduced the probabilistic model of GKAT based on sub-Markov kernels. The main model of the structure presented in this paper is naturally a relational version of such model. GKAT was also investigated further in [21], which in particular provides a semantics for which the equational theory is complete. The work [14] addressed the application of GKAT to unary (non-relational) properties of probabilistic programs, by investigating the relationships with the probabilistic Hoare logic aHL of [4].
In this work we have considered one probabilistic interpretation of GKAT to model the class of programs we wanted to address. By considering other more complex structures we would be able to increase the set of possible programs to analyse, an therefore capture a greater variety of examples. We could take, for instance, ProbGKAT [20], as our base structure, allowing to represents programs with probabilistic branching.
Another approach to relational reasoning of programs is approximate probabilistic relational Hoare logic (apRHL) [8], for the verification of differential privacy. The term ‘approximate’ refers here to the parameters associated to reasoning on judgments, which are related to the distance between the probabilistic distributions generated by the probabilistic programs.
7 Conclusion and perspectives
In this work we have introduced BiGKAT, a variant of KAT allowing to reason on relational properties of probabilistic programs, based on GKAT, provided a semantics for it based on sub-Markov kernels and a theory allowing to derive proofs. We have illustrated the expressivity of BiGKAT by proving how a subsystem of probabilistic relational Hoare logic [6] (with the while rule replaced by an iteration rule) can be soundly encoded in it.
In future work we want to extend this encoding to the full pRHL, including the while rule. Another interesting path, aligned with what we expressed in the previous section, would be to build a relational version of ProbGKAT, with the goal of extending the set of possible examples to programs with probabilistic branching. Moreover, while usually avoided, and always difficult, one possible direction for future work could be to consider a language with both nondeterminism and probabilities [23], capturing also more application scenarios. That could be indeed another direction to pursue in the future.
For the purpose of this work, we were focused on reasoning about properties of probabilistic non-interference over pairs of programs, i.e. 2-properties. However, we believe that the generalization to a n-relational framework could be possible to handle properties such as n-safety.
Another natural path would be an algebraic approach to subsume approximate probabilistic relational Hoare logic (apRHL) [8], in order to provide an equational way of reasoning on differential privacy. This direction could use the approach of [14] which gives a way to define an approximate version of GKAT.
Acknowledgements
Work partially funded by ANR Project HOPR (ANR-24-CE48-5521-01) from the French National Research Agency, by the project R-TALENT-21-001-BAILLOT from I-Site Université Lille Nord-Europe and by National Funds through FCT - Fundação para a Ciência e a Tecnologia, I.P. (Portuguese Foundation for Science and Technology) within the project IBEX, with reference 10.54499/PTDC/CCI-COM/4280/2021.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Some examples of distributions are the tossing of a fair coin, with probability 0.5 for 0 and 1, and the (discrete version of the) Laplacian distribution \(\mathcal {L}_{p}(a)\) centered in a with parameter p. The density function of \(\mathcal {L}_{p}(a)\) is given by \(\frac{1}{2p}\exp (\frac{|x-a|}{p})\).
Note that technically speaking according to the definition of GKAT the set \(\texttt {T}\) should be chosen finite, which is not the case here, but as observed in [22] Sect. 2.3 Example 2.5 we can use a finite subset \(\texttt {T}'\) of \(\texttt {T}\) for reasoning on pairwise equivalence of programs which terminate.