nach oben

Journal of Cryptology

Erschienen in:

Open Access 01.10.2021

The Design and Evolution of OCB

verfasst von: Ted Krovetz, Phillip Rogaway

Erschienen in: Journal of Cryptology | Ausgabe 4/2021

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

We describe OCB3, the final version of OCB, a blockcipher mode for authenticated encryption (AE). We prove the construction secure, up to the birthday bound, assuming its underlying blockcipher is secure as a strong-PRP. We study the scheme’s software performance, comparing its speed, on multiple platforms, to a variety of other AE schemes. We reflect on the history and development of the mode.

Communicated by Tetsu Iwata.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Schemes for authenticated encryption (AE) symmetrically encrypt a message in a way that ensures both its confidentiality and authenticity. OCB is a well-known algorithm for achieving this aim. It is a blockcipher mode of operation, the blockcipher usually being AES.

There are three main variants of OCB. The first, now called OCB1 (2001) [39], was motivated by Charanjit Jutla’s IAPM [24]. A second version, now called OCB2 (2004) [18, 38], added support for associated data (AD) [37] and redeveloped the mode using the idea of a tweakable blockcipher [30]. OCB2 was recently found to have a disastrous bug [17]. The final version of OCB, called OCB3 (2011) [26], corrected some missteps taken with OCB2 and achieved the best performance yet. It is specified in RFC 7253 [27] and was selected for the CAESAR final portfolio [7].

This paper is about OCB3: its definition, development, security, and software performance. We update the proceedings paper on OCB3 [26], freshening all of the experimental results, expanding the proof, and placing the entire enterprise in context. When we speak of OCB in this paper, we will henceforth mean OCB3.

OCB encryption protects both the confidentiality and authenticity of a plaintext M and, additionally, the authenticity of an associated data A and nonce N. The nonce, a string of 120 or fewer bits, must be unique to each encryption call. OCB does its work using a 128-bit blockcipher E. Encryption needs at most $m+a+2$ blockcipher calls, where $m = \lceil |M|/128\rceil $ is the block size of the plaintext and $a = \lceil |A|/128\rceil $ is the block size of the AD. If the nonce is implemented as a counter and the implementation caches a secret 16-byte value with each message encrypted, then $63/64\approx 98\%$ of the time the number of blockcipher calls can be reduced to $m+a+1$. If the AD is fixed during a session, then after processing it the first time, there is effectively no computational cost for subsequent authentications of it and the addend of a should be ignored. OCB requires a single key K for the underlying blockcipher, and all blockcipher calls are keyed with it. OCB is online in the sense that one need not know the length of A or M to proceed with encryption, nor need one know the length of A or the ciphertext C to proceed with decryption. OCB is fully parallelizable: almost all of its blockcipher calls can be performed simultaneously. Computational work beyond blockcipher calls is restricted to a small number of logical operations per call.

OCB enjoys provable security: the mode is a secure AE scheme, under the standard definition [37, 39], assuming that the underlying blockcipher E is secure as a strong pseudorandom permutation (PRP). As with many modes of operation, security degrades quadratically in the number of blocks processed.

The starting point for all versions of OCB was to insist on a parallelizable design based on a 128-bit blockcipher, presumably AES. We aimed to minimize overhead, understood as any work beyond a single AES-call per input block. As time has gone on, this starting point has only come to seem better. This is because a growing fraction of general-purpose CPUs include hardware support for AES, and the quality of AES acceleration has been steadily improving. On an Intel Cannon Lake processor, we see peak speeds for OCB of about 0.43 cpb (cycles per byte). On an ARM Cortex-A73 processor, we see peak speeds of about 1.14 cpb. These speeds are about 30% and 40% faster, respectively, than what we observe for GCM [10]. Securely encrypting at such speeds on a general-purpose processor might have seemed unrealistic even a decade ago.

We now move on to provide some preliminaries on blockciphers and AE schemes. We then give two equivalent descriptions of OCB—the first based on a conventional blockcipher, like AES, the second based on a tweakable blockcipher [30] constructed from an ordinary blockcipher. Next we describe some of the design choices underlying OCB, trying to sketch how the mechanism came to take its final form. After that, we describe our experimental work, comparing the software performance of OCB and five other AE schemes. Our experiments were carried out on three disparate platforms. Our performance study is limited to software; for hardware studies of CAESAR candidates, including OCB, see the work from Kris Gaj’s lab at GMU [8, 15]. Then comes a proof of security. We conclude with some comments about OCB and its significance, identifying some of the lessons learned along the way.

2 Preliminaries

We begin with a few preliminaries, defining blockciphers, tweakable blockciphers, and AE schemes, and how we will measure the security of each.

Blockciphers. A blockcipher is a function $E\!:\mathcal {K}\times {\{0,1\}}^n\!\rightarrow \!{\{0,1\}}^n$ where the key space $\mathcal {K}$ is a finite nonempty set, the block size $n\!\ge 1\!$ is a number, and $E(K,\cdot )=E_K(\cdot )$ is a permutation for each $K\in \mathcal {K}$. Define $E^{-1}\!: \mathcal {K}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ from E by saying that $E^{-1}(K,Y)=E^{-1}_{K}(Y)$ is the unique point X such that $E(K,X)=Y$.

OCB requires a blockcipher E with a block size of $n=128$ bits. It is possible to define OCB for other block sizes [25], but this paper does not. We usually have in mind that OCB’s blockcipher E is AES (which always has a block size of $n=128$ bits). All of our performance studies assume that E is AES-128, the version of AES with key space $\mathcal {K}={\{0,1\}}^{128}$.

We next define the PRP (pseudorandom permutation) and strong-PRP security for a blockcipher E. With $E\!:\mathcal {K}\times {\{0,1\}}^n\!\rightarrow \!{\{0,1\}}^n$ a blockcipher and $\mathscr {A}$ an adversary (an algorithm with access to one or more oracles), define the strong-PRP security of E as

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{{\smash {\pm \mathrm {prp}}}}_E(\mathscr {A}) =\Pr [ \mathscr {A}^{E_K(\cdot ),\,E^{-1}_K(\cdot )}\Rightarrow 1] -\Pr [ \mathscr {A}^{\pi (\cdot ),\,\pi ^{-1}(\cdot )}\Rightarrow 1] \end{aligned}$$

where $K\twoheadleftarrow \mathcal {K}$ is selected uniformly from $\mathcal {K}$ in the experiment underlying the left-hand addend and $\pi (\cdot )\twoheadleftarrow \mathrm {Perm}({n})$ is selected uniformly from $\mathrm {Perm}({n})$, the set of all permutations on ${\{0,1\}}^n$, in the experiment underlying the right-hand addend. When we write $\mathscr {A}^{\mathcal{O}_1,\mathcal{O}_2,\cdots }\Rightarrow 1$, we mean the event that $\mathscr {A}$ with the specified oracles outputs the bit 1 and halts. Define the (ordinary) PRP-security of E as

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{{\mathrm {prp}}}_E(\mathscr {A}) = \Pr [ \mathscr {A}^{E_K(\cdot )}\Rightarrow 1] -\Pr [\mathscr {A}^{\pi (\cdot )}\Rightarrow 1] \end{aligned}$$

by removing the second oracle in each experiment.

The ideal blockcipher of block size n is the blockcipher $\mathbb {E}^{n}:\mathcal {K}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ where each key $K\in \mathcal {K}$ names a distinct permutation. In this blockcipher, the key space $\mathcal {K}$ has $2^n!$ keys, each corresponding to a table specifying an n-bit permutation.

Tweakable blockciphers. Liskov, Rivest, and Wagner (LRW) put forward the notion of a tweakable blockcipher [30], and they demonstrated that an OCB1-like construction could more easily be designed and proven secure based on this abstraction than a conventional blockcipher. Our work builds on this insight, viewing OCB as based either on a blockcipher or on a tweakable blockcipher (in which case we adjust the name to ${\Theta \text {{CB}} } $).

A tweakable blockcipher (TBC) is a function ${\smash {\widetilde{E}}}\!: \mathcal {K}\times \mathcal{T}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ where the key space $\mathcal {K}$ is a finite nonempty set, or a set that otherwise has an understood distribution on it; the tweak space $\mathcal{T}$ is a nonempty set; and the block size $n\ge 1$ is a number. We insist that ${\smash {\widetilde{E}}}(K,T,\cdot )$ be a permutation on ${\{0,1\}}^n$. Its inverse ${\smash {\widetilde{D}}}={\smash {\widetilde{E}}}^{-1}$ is defined by letting ${\smash {\widetilde{D}}}(K,T,Y)=X$ when ${\smash {\widetilde{E}}}(K,T,X)=Y$. We may write ${\smash {\widetilde{E}}}_K^T(\cdot )={\smash {\widetilde{E}}}(K,T,X)$ and ${\smash {\widetilde{D}}}(K,T,Y)={\smash {\widetilde{D}}}_K^T(Y)$.

For a granular notion of security for the TBC ${\smash {\widetilde{E}}}\!: \mathcal {K}\times \mathcal{T}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$, we identify each tweak as either a unidirectional tweak (only forward queries are allowed for it) or as a bidirectional one. (Both forward and backward queries are allowed.) Intuitively, this distinction is useful because, when making a TBC from a conventional blockcipher, unidirectional tweaks only need tweak-dependent “pre-whitening” but bidirectional tweaks need “post-whitening” too [30, 38]. Formally, one names the subset $\mathcal{B}\subseteq \mathcal{T}$ of bidirectional tweaks, understanding that the remaining tweaks $\mathcal{U}=\mathcal{T}\setminus \mathcal{B}$ are unidirectional. We associate with the TBC ${\smash {\widetilde{E}}}\!:\mathcal {K}\times \mathcal{T}\times {\{0,1\}}^n\!\rightarrow \!{\{0,1\}}^n$ and the subset of tweaks $\mathcal{B}\subseteq \mathcal{T}$ the partitioned-tweakspace PRP measure

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{{{\mathcal{B}}\text{- }\mathrm {prp}}}_{{\widetilde{E}}}(\mathscr {A}) =\Pr [ \mathscr {A}^{{\smash {\widetilde{E}}}_K(\cdot ,\cdot ),\,{\smash {\widetilde{D}}}_K(\cdot ,\cdot )}\Rightarrow 1] -\Pr [\mathscr {A}^{\pi (\cdot ,\cdot ),\,\pi ^{-1}(\cdot ,\cdot )}\Rightarrow 1] \end{aligned}$$

where $K\twoheadleftarrow \mathcal {K}$ is chosen uniformly at the beginning of the left-hand experiment, where $\pi \twoheadleftarrow \mathrm {Perm}({\mathcal{T},n})$ is chosen uniformly at the beginning of the right-hand experiment, and where adversary $\mathscr {A}$ may not ask any decryption (second-oracle) query (T, Y) with $T\in \mathcal{T}\setminus \mathcal{B}$. Any such (invalid) query would get empty-string response. By $\mathrm {Perm}({\mathcal{T},n})$ we mean the space of all permutations $\pi (T,\cdot )$ on ${\{0,1\}}^n$, one for each $T\in \mathcal{T}$.

The partitioned-tweakspace PRP notion above unifies tweakable-PRP and strong-tweakable-PRP security: with ${\smash {\widetilde{E}}}\!:\mathcal {K}\times \mathcal{T}\times {\{0,1\}}^n\!\rightarrow \!{\{0,1\}}^n$ a TBC we can resurrect the notion for a strong-tweakable-PRP [30] by defining ${\mathbf \mathbf{Adv}}_{{\widetilde{E}}}^{\smash {\pm \mathrm {prp}}}(\mathscr {A}) = {\mathbf \mathbf{Adv}}_{{\widetilde{E}}}^{{{\mathcal{T}}\text{- }\mathrm {prp}}}(\mathscr {A})$. We can realize the conventional (not strong) version [30] by defining ${\mathbf \mathbf{Adv}}_{{\widetilde{E}}}^{\mathrm {prp}}(\mathscr {A}) = {\mathbf \mathbf{Adv}}_{{\widetilde{E}}}^{{{\emptyset }\text{- }\mathrm {prp}}}(\mathscr {A})$. Going a step further, the PRP and strong-PRP security notions for a conventional blockcipher $E\!:\mathcal {K}\times {\{0,1\}}^n\!\rightarrow \!{\{0,1\}}^n$ can be recovered by regarding the tweak space as a singleton set.

The ideal TBC with tweak space $\mathcal{T}$ and block size n is the TBC ${\smash {\widetilde{\mathbb {E}}}}^{\mathcal{T},n}:\mathcal {K}\times \mathcal{T}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ where each key $K\in \mathcal {K}$ and $T\in \mathcal{T}$ names a distinct permutation ${\smash {\widetilde{\mathbb {E}}}}^{n}(K,T,\cdot )$ on ${\{0,1\}}^n$.

AE schemes. Following traditions rooted in prior work [4, 28, 37, 39], we regard an AE scheme (meaning a nonce-based AE scheme with associated-data, sometimes referred to as an AEAD scheme [37]) as a triple $\Pi =(\Pi .\mathcal {K},\Pi .\mathcal {E},\Pi .\mathcal {D})=(\mathcal {K},\mathcal {E},\mathcal {D})$. The key space $\mathcal {K}$ is a finite nonempty set (or a set that otherwise has an understood distribution). The encryption algorithm $\mathcal {E}$ takes in a key $K\in \mathcal {K}$, a nonce $N\in {\{0,1\}}^*$, a string of associated data $A\in {\{0,1\}}^{*}$, and a plaintext $M\in {\{0,1\}}^*$. It returns, deterministically, a string $C=\mathcal {E}(K,N,A,M)$ that is either a string $C\in \mathcal{C}={\{0,1\}}^*$, the ciphertext, or the symbol ${\bot }$. The former must happen if and only if $(K,N,A,M)\in \mathcal {K}\times \mathcal{N}\times \mathcal{M}\times {\mathcal {A}}$ for nonempty sets $\mathcal{N}$, ${\mathcal {A}}$, and $\mathcal{M}$, the nonce space, AD space, and message space of $\Pi $. The decryption algorithm $\mathcal {D}$ takes in a tuple $(K,N,A,C)$ and deterministically returns a value $\mathcal {D}(K,N,A,C)$ that is either a string M or the distinguished symbol ${\bot }$. The latter must happen if $(K,N,A,C)\not \in \mathcal {K}\times \mathcal{N}\times {\mathcal {A}}\times \mathcal{C}$. We require that $\mathcal {D}(K,N,A,C)=M$ when $C=\mathcal {E}(K,N,A,M)$ is a string. We require of the message space $\mathcal{M}$ that $M\in \mathcal{M}$ implies $M'\in \mathcal{M}$ when $|M'|=|M|$. We require that $|\mathcal {E}(K,N,A,M)|=|\mathcal {E}(K,N,A,M')|=|M|+\tau $ when the encryptions are strings and $|M|=|M'|$. We call $\tau $ the tag length or expansion of the AE scheme. We may write $\mathcal {E}(K,N,A,M)$ as $\mathcal {E}_K(N,A,M)$ or $\mathcal {E}_K^{N,A}(M)$, and similarly for $\mathcal {D}$.

Let $\Pi =(\mathcal {K},\mathcal {E},\mathcal {D})$ be an AE scheme with tag length $\tau $. Given an adversary $\mathscr {A}$, we measure how well it breaks the privacy of (here meaning the confidentiality) of $\Pi $ by the real number

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\mathrm {priv}}_\Pi (\mathscr {A}) = \Pr [ \mathscr {A}^{\mathcal {E}_K(\cdot ,\cdot ,\cdot )}\Rightarrow 1] - \Pr [\mathscr {A}^{ \$(\cdot ,\cdot ,\cdot )}\Rightarrow 1]\;. \end{aligned}$$

The first addend is the probability that $\mathscr {A}$ outputs 1 when given the oracle $\mathcal {E}_K(\cdot ,\cdot ,\cdot )$. The experiment begins by uniformly selecting a key $K\twoheadleftarrow \mathcal {K}$. Then, when the adversary asks a query (N, A, M), the oracle responds with $\mathcal {E}_K(N,A,M)$. The second term is the probability that $\mathscr {A}$ outputs 1 when given the oracle $\$(\cdot ,\cdot ,\cdot )$. This oracle, on input $(N,A,M)$, returns a uniformly random string of length $|M|+\tau $. We demand that $\mathscr {A}$ never asks two queries with the same first component (the N-value) and that it never asks a query outside of $\mathcal{N}\times \mathcal{A}\times \mathcal{M}$. Any such adversary query is invalid.

Next we define authenticity. With $\Pi $ and $\mathscr {A}$ as before, let

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\mathrm {auth}}_\Pi (\mathscr {A}) =\Pr [ \mathscr {A}^{\mathcal {E}_K(\cdot ,\cdot ,\cdot )}\;\,\text{ forges}]\;. \end{aligned}$$

Once again the experiment implicitly begins by selecting a key $K\twoheadleftarrow \mathcal {K}$. We say that the adversary forges if it outputs a value $(N,A,C)\in \mathcal{N}\times \mathcal{A}\times \mathcal{C}$ such that $\mathcal {D}(K,N,A,C)\ne {\bot }$ yet there was no prior query (N, A, M) that returned C. We demand that $\mathscr {A}$ never asks two queries with the same N-value (the first component) and never asks a query outside of $\mathcal{N}\times \mathcal{A}\times \mathcal{M}$.

Informally, an AE scheme $\Pi $ is secure if any adversary $\mathscr {A}$ with “reasonable” resources has “small” priv-advantage and small auth-advantage. Following the concrete-security tradition, and in view of the fact that we will only be defining OCB for use with an $n=128$ bit blockcipher, we refrain from making any absolute or asymptotic definition of security.

An alternative definition of AE security merges privacy and authenticity to get a unified security measure

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\mathrm {ae}}_\Pi (\mathscr {A}) =\Pr [\mathscr {A}^{\mathcal {E}_K(\cdot ,\cdot ,\cdot ),\,\mathcal {D}_K(\cdot ,\cdot ,\cdot )}] - \Pr [\mathscr {A}^{\$(\cdot ,\cdot ,\cdot ),\, \bot (\cdot ,\cdot ,\cdot )}]\;. \end{aligned}$$

The adversary is given either a pair of “real” encryption and decryption oracles or, alternatively, a pair of “fake” encryption and decryption oracles. The former begin by choosing a random $K\twoheadleftarrow \mathcal {K}$. After, the first oracle (the encryption oracle) responds to a query (N, A, M) with $\mathcal {E}(K,N,A,M)$, while the second oracle (the decryption oracle) responds to a query (N, A, C) with $\mathcal {D}(K,N,A,C)$. The $\$(\cdot ,\cdot ,\cdot )$ oracle, on query (N, A, M), just returns $|M|+\tau $ uniformly random bits, where $\tau $ is the tag length of $\Pi $. The $\bot (\cdot ,\cdot ,\cdot )$ oracle, on query (N, A, C), always returns $\bot $. The adversary may not repeat the first component, N, to its encryption oracle; it may not ask a decryption query of (N, A, C) following an encryption query of (N, A, M) that returned C; and it may only ask encryption queries in $\mathcal{N}\times {\mathcal {A}}\times \mathcal{M}$ and decryption queries in $\mathcal{N}\times {\mathcal {A}}\times \mathcal{C}$.

The ae-security notion is essentially equivalent to the conjunction of priv-security and auth-security [41, Section 7]. In our proof of OCB’s security, we will use one direction of this claim.

We say that AE schemes $\Pi =(\mathcal {K},\mathcal {E},\mathcal {D})$ and $\Pi '=(\mathcal {K}',\mathcal {E}',\mathcal {D}')$ coincide if their encryption and decryption algorithms compute identical functions: $\mathcal {E}(K,N,A,M)=\mathcal {E}'(K,N,A,M)$ and $\mathcal {D}(K,N,A,C)=\mathcal {D}'(K,N,A,C)$ for all K, N, A, M, and C. This implies $\mathcal {K}=\mathcal {K}'$, too.

We note that our priv and ae security notions demand indistinguishability from random bits (sometimes denoted IND$) instead of indistinguishability from the encryption of random (or fixed) bits. This notion goes back to OCB1 [39]. There are several reasons for it. First, this notion of privacy is stronger, yet is achieved by OCB (and most other real-world AE schemes). At the same time, it is also what is most convenient to prove. Also, the encryption algorithm of an AE scheme meeting IND$-security can be used as a pseudorandom generator (PRG), which might be convenient, in some cases, for the user. Perhaps most significantly, IND$-security implies a form of anonymity [1] where one demands that ciphertexts be indistinguishable whether they are constructed from one hidden key K or are, instead, constructed using either of two hidden keys $K_0,K_1$, which one being selected by the adversary.

3 Specification 1

We now define OCB (meaning OCB3). Given a blockcipher $E{:\;}\mathcal {K}\times {\{0,1\}}^{128}\rightarrow {\{0,1\}}^{128}$ and a tag length $\tau \in [0{..}128]$, Fig. 1 defines from E and $\tau $ the AE scheme $\Pi = {\text {OCB} }[E,\tau ] = (\mathcal {K},\mathcal {E},\mathcal {D})$. The nonce space $\mathcal{N}=\mathcal{N}_{\text {OCB} }={\{0,1\}}^{\le 120}$ is the set of all binary strings with at most 120 bits. The message space $\mathcal{M}={\{0,1\}}^*$ and AD-space ${\mathcal {A}}={\{0,1\}}^*$ are all binary strings. The $\mathrm {Setup}$ procedure of Fig. 1 is implicitly run on or before the first call to $\mathcal {E}$ or $\mathcal {D}$. The variables it defines are understood to be global. While line 114 is written as an infinite loop, it is sufficient to compute $L[0],L[1],\ldots , L[d]$ where $2^d$ upper bounds the number of 128-bit blocks in any plaintext M, ciphertext C, and associated data A. We write ${\mathrm {ntz}({i})}$ for the number of trailing zeros in the binary representation of positive integer i (e.g., ${\mathrm {ntz}({1})}\!=\!{\mathrm {ntz}({3})}\!=\!0$ and ${\mathrm {ntz}({4})}\!=\!2$). We write ${\mathrm {msb}}(X)$ for the first (most significant) bit of the string X. We write $A\wedge B$ for the bitwise-and of the equal-length strings A and B. We write $A{\ll }i$ for the shift of A by i positions to the left (maintaining string length, leftmost bits falling off, zero-bits entering at the right). We write either $A\; B$ or $A\;\Vert \;B$ for the concatenation of strings A and B. At line 121 we write $[a]_7$ for $a\in [0..127]$ encoded as a 7-bit string, while at line 126 we regard the variable $\mathrm {Bottom}$ as a number instead of a string.

For reasons of generality, OCB is defined to operate on bit strings. But for reasons of simplicity and efficiency, most implementations will assume that all strings operated on are strings of bytes, with a byte being eight bits.

Figure 2 illustrates encryption under OCB. Function $\text {Inc}$ implicitly depends on the key K, with $\text {Inc}_i(\Delta ) \!=\! \Delta \oplus L[{\mathrm {ntz}({i})}]$, $\text {Inc}_\$(\Delta )\! =\! \Delta \oplus L_{\$}$, $\text {Inc}_*(\Delta ) \!=\! \Delta \oplus L_*$, and the (zero-argument) $\text {Init}\!=\! 0^{128}$. Here $L_*\!=\! E_K(0^{128})$, $L_{\$}= 2\cdot L_*\!=\! \mathrm {double}({L_*})$, and $L[i] = 2^{2+i}\cdot L_*$ for all $i\ge 0$, the multiplication in ${{\smash {{\mathrm {GF}}(2^{128})}}}$. Here we write $2=0^{126}10 = \texttt {x}$ to denote a particular nonzero point of the finite field. The diagrams should be read left to right and then top to bottom: first set $\Delta $ to be $\text {Init}_K(N)$; then modify $\Delta $ with $\text {Inc}_1$; then compute $C_1$ using the blockcipher E with pre- and post-whitening with $\Delta $; then modify $\Delta $ with $\text {Inc}_2$; and so on.

4 Specification 2

In this section, we generalize OCB by describing the construction in terms of a tweakable blockcipher (TBC). We show that this alternative description really is a generalization of OCB by explaining how to recover the definition of OCB with a particular choice for its TBC.

4.1 The TBC-Based Scheme

We can view OCB as an AE scheme built not from a blockcipher but from a tweakable blockcipher (TBC) [30], as defined and discussed in Section 2. Refer to Fig. 3, which defines the TBC-based version of OCB, which we name ${\Theta \text {{CB}} } $. It depends on a tweakable blockcipher ${\smash {\widetilde{E}}}\!:\mathcal {K}\times \mathcal{T}_{\Theta \text {{CB}} } \times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ with $n=128$ and a tag length $\tau \in [0..128]$. The tweak space for ${\smash {\widetilde{E}}}$ is the rather complicated set

$$\begin{aligned} \mathcal{T}_{\Theta \text {{CB}} } = \mathcal{N}\!\times \! {\mathbb {N}_1}\,\cup \, \mathcal{N}\!\times \! \mathbb {N}_0\!\times \! \{*\} \,\cup \, \mathcal{N}\!\times \! \mathbb {N}_0\!\times \! \{\$\} \,\cup \, \mathcal{N}\!\times \! \mathbb {N}_0\!\times \! \{*\$\} \,\cup \, {\mathbb {N}_1}\,\cup \, \mathbb {N}_0\!\times \! \{*\} \end{aligned}$$

where $\mathcal{N}=\mathcal{N}_{\text {OCB} }={\{0,1\}}^{\le 120}$ is the nonce space for OCB and ${\mathbb {N}_1}$ and $\mathbb {N}_0$ are the positive and nonnegative integers, respectively. For the TBC-based version of OCB, it is not actually necessary that $n=128$, and the set $\mathcal{T}_{\Theta \text {{CB}} } $ could just as well be defined from any nonce space $\mathcal{N}$.

Tweaks, it can be seen, are of six mutually exclusive types. Those of the first type are in the set $\mathcal{B}= \mathcal{N}\!\times \!{\mathbb {N}_1}$ of bidirectional tweaks. Omitting parenthesis and commas when writing tweaks, calls to the TBC ${\smash {\widetilde{E}}}$ will look like ${\smash {\widetilde{E}}}_K^{{\,}N{\,}i}(X)$, ${\smash {\widetilde{E}}}_K^{{\,}N{\,}i{\,}*}(X)$, ${\smash {\widetilde{E}}}_K^{{\,}N{\,}i{\,}\$}(X)$, ${\smash {\widetilde{E}}}_K^{{\,}N{\,}i{\,}* {\,}\$}(X)$, ${\smash {\widetilde{E}}}_K^{{\,}i}(X)$, or ${\smash {\widetilde{E}}}_K^{{\,}i{\,}*}(X)$.

With ${\smash {\widetilde{E}}}\!:\mathcal {K}\times \mathcal{T}_{\Theta \text {{CB}} } \times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ and $\tau \in [0..n]$ the AE scheme $\Pi = {\Theta \text {{CB}} } [{\smash {\widetilde{E}}},\tau ]=(\mathcal {K},\mathcal {E},\mathcal {D})$ continues to use the nonce space $\mathcal{N}_{\text {OCB} }$ of OCB, its message space is $\mathcal{M}={\{0,1\}}^*$, its AD space is ${\mathcal {A}}=\{0,1\}^*$, and its ciphertext space is $\mathcal{C}={\{0,1\}}^*$ (actual ciphertexts are, as before, in ${\{0,1\}}^{\ge \tau }$).

Encryption under $\Theta $CB is illustrated in Fig. 4. The encryption process is conceptually simpler than it is under OCB because the offsets (the $\Delta $-values) of Fig. 2 are now gone, replaced by the tweaks that, conceptually, identify unrelated permutations $E_K^T$ and $E_K^{T'}$ for distinct tweaks $T,T'$.

4.2 Instantiating the TBC

We can instantiate the TBC of ${\Theta \text {{CB}} } [{\smash {\widetilde{E}}},\tau ]$ in a way that makes the scheme coincide with ${\text {OCB} }[E,\tau ]$. With $n=128$ and $\mathcal{N}= \mathcal{N}_{\text {OCB} }= \{0,1\}^{\le 120}$, map the blockcipher $E{:\;}\mathcal {K}\times {\{0,1\}}^{n}\!\rightarrow \!{\{0,1\}}^{n}$ to a TBC ${\smash {\widetilde{E}}}= \text {Tw} [E,\tau ]$ with signature ${\smash {\widetilde{E}}}{:\;}\mathcal {K}\times \mathcal{T}_{\Theta \text {{CB}} } \times {\{0,1\}}^{n}\!\rightarrow \!{\{0,1\}}^{n}$. The construction is specified in Fig. 5. There, multiplication is in ${{\smash {{\mathrm {GF}}(2^{128})}}}$ using the irreducible polynomial $\mathtt {x}^{128} \!+\! \mathtt {x}^7 \!+\! \mathtt {x}^2 \!+\! \mathtt {x}\!+\! 1$. We do not distinguish between strings and what they represent in this finite field.

The code of Fig. 5 is more understandable with the following background. The sequence of values $\gamma _0,\gamma _1,\gamma _2,\ldots $ employed is called a Gray code. We give a couple of well-known facts about Gray codes. First, that $\gamma :\mathbb {N}_0\rightarrow \mathbb {N}_0$ is a permutation. In fact, it is a permutation on $\mathbb {Z}_{2^i}$ for each i. Thus, we also have that $0\le \gamma _i\le 2i$ for all i. It follows from these facts and the definition of our $\lambda $-values that all of the values $\Lambda = \{\lambda _i, \; \lambda _j^*, \; \lambda _j^\$, \; \lambda _j^{*\$}:\; 1\le i\le 2^{120}, \;0\le j\le 2^{120}\}$ are distinct and nonzero. This is the only thing we will actually need of these values.

The reader can now check that $\text {OCB} [E,\tau ]={\Theta \text {{CB}} } [\text {Tw} [E,\tau ],\tau ]$—we have defined $\text {Tw} $ to make that so—a fact important enough to record as a lemma:

Lemma 1

For any blockcipher $E\!:\mathcal {K}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$ and number $\tau \in [0..n]$, the construction ${\text {OCB} }[E,\tau ]$ coincides with ${\Theta \text {{CB}} } [\text {Tw} [E,\tau ],\tau ]$.

5 Evolution

We now explain some of the design choices made for OCB, simultaneously sketching the mode’s history.

Development of ocb1. The initial reason for developing OCB1 [39] was to improve the efficiency of authenticated encryption. Prior work by Jutla [24], and also by Katz and Yung [28] and Gligor and Donescu [11], had evidenced that tightly integrating privacy and authenticity let one get by with less work than what one would see from generically composing separate privacy and authenticity mechanisms. Attending to “low-level” concerns, we aimed to create a blockcipher-based AE scheme that would push the efficiency bar. OCB1 handles plaintexts of any length and produces ciphertexts $\tau $-bits longer. It uses a near-minimal number of blockcipher calls and minimizes work beyond those calls. It allows nearly all blockcipher calls to be performed in parallel. It assumes a nonce, not a random IV. It is deterministic. A single key underlies all blockcipher calls. It is online, making a single pass over the plaintext or ciphertext, whose length may not be known in advance.

Development of ocb2. A series of devastating attacks on 802.11 (WiFi) security created an urgent need to replace its confidentiality mechanism. Jesse Walker, Nancy Cam-Winget, and others advocated replacing it with OCB1. But they explained to Rogaway that there would need to be some way to authenticate but not encrypt header information. Formalizing and achieving this end led to the idea of associated data and to the paper by Rogaway that introduced it [37].

A couple years later, a different paper by Rogaway [38] proposed adjusting the offsets used in OCB1, going from the Gray-code ordered $L, 3L, 2L, 6L, 7L, 5L, \ldots $ to the powering-up ordered $L, 2L, 2^2L, 2^3L, 2^4L, \dots $. OCB2 does this, and also accommodates the AD. It was adopted as an ISO standard in 2009 [18]. As mentioned already, the scheme has a major error [17].

ocb1’s speed questioned. In 2004 McGrew and Viega wrote a paper claiming that GCM [10] was about as fast, if not faster, than OCB1 [31]. They suggested that two design issues underlie this. First, OCB1 uses $m+2$ blockcipher calls to encrypt a message of $m=\lceil |M|/128\rceil $ blocks. In contrast, GCM makes do with $m+1$. Second, OCB1 twice needs the result of one AES computation before another AES computation can proceed. Both in hardware and in software, this can degrade performance. Still, we suspected that these two matters would have only a minor impact on performance. The timing studies were misleading, we believed, because McGrew and Viega had compared a reference implementation of OCB with an optimized implementation of GCM. Even that was irreproducible due to McGrew and Viega’s use of proprietary GCM code.

We decided to revise OCB1/OCB2 to address and assess the two efficiency concerns above. We integrated scheme development and performance measurements. The result of our work was the definition of OCB3 and a comparative study of the performance of CCM [9], GCM, and the OCB triplet [26]. We found that, across a variety of platforms, all versions of OCB were considerably faster than both CCM and GCM. The performance differences among OCB1, OCB2, and OCB3 were small. OCB3 had the best performance, but OCB2’s performance was actually worse than OCB1’s. We now look at some of the OCB3 changes over OCB1/OCB2.

Reduced latency. Suppose that the message $M=M_1\cdots M_m\, M_*$ being encrypted is not a multiple of 128 bits: it has a final block $M_*$ of 1–127 bits. With OCB1 and OCB2 one makes an $M_m$-dependent blockcipher call to compute a ciphertext block $C_m$ that is in turn used to create a Checksum that is enciphered with another blockcipher call. This can result in a pipeline stall [31]: one blockcipher call must finish before the next one can begin. For OCB3 we restructure the algorithm so that the Checksum does not depend on any ciphertext block: we simplify the Checksum to $M_1\oplus M_2\oplus \cdots \oplus M_{m}$ when there is a full final block $M_m$, and $M_1\oplus M_2\oplus \cdots \oplus M_{m}\oplus M_*10^*$ otherwise.

Incrementing offsets. In OCB1, each noninitial offset is computed from the prior one by xoring in some key-derived value: the ith offset is $\Delta \!\leftarrow \!\Delta \oplus L[{\mathrm {ntz}({i})}]$. In OCB2, each noninitial offset is computed from the prior one by multiplying it, in ${{\smash {{\mathrm {GF}}(2^{128})}}}$, by a constant: $\Delta \!\leftarrow \! (\Delta {\ll }1)\oplus ({\mathrm {msb}}(\Delta )\cdot \text {135})$, an operation we think of as doubling. The first approach turns out to be faster [39]. While doubling can be coded in five Intel x86-64 assembly instructions, it still runs more slowly. In some settings, doubling loses big: it is expensive on 32-bit machines, and some compilers do poorly at turning C/C++ code for doubling into machine code that exploits the available instructions. On Intel x86, the 128-bit SSE registers cannot be efficiently shifted one position to the left. Finally, the doubling operation is not endian neutral: if we must create a bit pattern in memory to match the sequence generated by doubling, we will effectively favor big-endian architectures. We can trade this bias for a little-endian one by redefining $\mathrm {double}({})$ to include a byteswap. But one is still favoring one endian convention over the other, and not just at key-setup time. For all of these reasons, OCB3 reverts to the OCB1 method incrementing offsets.

In the development of OCB3, we tried a variety of other methods to generate the needed offsets, exploring word-oriented LFSRs (linear feedback shift registers) of various designs [39, Appendix B]. No alternative performed significantly better, even in isolation, than the method used in OCB1.

Trimming a blockcipher call. OCB1 and OCB2 took $m+2$ blockcipher calls to encrypt an m-block string M: one to map the nonce N into an initial offset $\Delta $; one for each block of M; and one to encipher the final Checksum. The first of these is easy to eliminate if one is willing to replace a blockcipher call by, say, $K_1\cdot N$, the product in ${{\smash {{\mathrm {GF}}(2^{128})}}}$ of nonce N and a variant $K_1$ of K. But such a change would necessitate implementing a ${{\smash {{\mathrm {GF}}(2^{128})}}}$ multiply. We use a different method to trim the extra blockcipher call, most of the time, for a counter-based nonce: we employ a new xor-universal hash function, which we call the stretch-then-shift hash. The initial offset $\Delta $ is $\Delta = H_K(N) = (\mathrm {Stretch}{\ll }\mathrm {Bottom})[1{..}128]$ where $\mathrm {Bottom}$ is the last six bits of N and the $(128\!+\!64)$-bit string $\mathrm {Stretch}$ is made by a process involving enciphering N with its last six bits zeroed out. (This description ignores the inclusion of the tag length $\tau $ along with N.) The technique ensures that when the nonce N is a counter, the initial offset $\Delta $ can be computed without a new blockcipher call about $63/64\approx 98\%$ of the time. In this way we reduce cost from $m+2$ blockcipher calls to an amortized $m\!+\!1.02$ blockcipher calls. Note, however, that achieving this reduction requires an implementation to cache the value of Ktop (line 124) and to notice, when true, that it need not be recomputed. And there are security considerations in doing this, for it is essential that the adversary not have access to the cached value.

Incorporating tag length. The algorithm of this paper coincides with that of RFC 7253 [27] but differs from the algorithm in the proceedings paper [26] in one minor way: the tag length is used in computing the initial offset $\Delta $ (line 121). A change in this direction was requested during the RFC review process in order that a ciphertext created for one value of the tag length $\tau $ should not be of value in forging a ciphertext for a different value of $\tau $. The change is heuristic, as security claims associated with OCB assume $\tau $ to be fixed. Subsequent work by Reyhanitabar, Vaudenay, and Vizár demonstrated that the manner in which we incorporate $\tau $ does not work to improve the security notion to one where the same key is used for multiple tag lengths [36].

Representing field points. A low-level choice where OCB and GCM [10] part ways is in the representation of field elements. In GCM the polynomial $a_{127}\mathtt {x}^{127}+\cdots a_1\mathtt {x}+ a_0$ corresponds to the string $a_{0}\ldots a_{127}$ rather than the string $a_{127}\ldots a_{0}$. The usual convention on machines, whether big-endian and little-endian, is that the msb is the leftmost bit of any register. One advantage of following that convention is that a left shift by one can be implemented by adding a register to itself, an operation that is sometimes faster than a 1-bit logical shift. For example, on an ARM Cortex-A57 addition has one cycle latency but logical shift has two.

Incremental APIs. Unlike OCB1 and OCB2, under OCB3, each 128-bit block of plaintext is processed in the same way whether or not it is the final 128 bits. This change facilitates implementing an incremental API, as one is able to output each 128-bit chunk of ciphertext after receiving the corresponding chunk of plaintext, even if it is not yet known if the plaintext is over. Similarly, each 128-bit block of AD is now treated the same way whether it is or is not the message’s end, simplifying the incremental provisioning of it.

Endian issues. While we expect the majority of processors running OCB will be little-endian, the mode’s definition does nothing to favor this convention. Endian issues arise when “register oriented” and “memory oriented” values interact. These are the same on big-endian machines, but are opposite on little-endian ones. One could, therefore, favor little-endian machines by building into the algorithm byte swaps that mimic those that would occur naturally each time memory and register oriented data interact. We experimentally adapted our implementation to do this but found that it made very little performance difference. One reason for this is the efficient byte-reversal capability on most modern processors (e.g., the pshufb instruction can reverse 16 bytes on x86 machines in a single cycle). Also, the OCB1-based approach for incrementing offsets allows precomputed values to be endian-adjusted at key-setup time, removing most endian-dependency for subsequent encryption or decryption calls. Since it makes little difference to performance, and big-endian specifications are conceptually easier, OCB does not make any gestures toward little-endian orientation.

6 Performance

Figures 6, 7, and 8 compare the performance of OCB and several other AE schemes. The OCB alternatives were chosen based on their popularity and salience. The TLS 1.3 specification [35] mentions three AE algorithms—GCM [10, 31], CCM [9, 43] and Chacha20-Poly1305 [5, 32]. These schemes are well established and probably represent the bulk of AE in use today. So we begin with these three schemes. Then, in the recently concluded CAESAR competition, AEGIS-128 [44] and OCB were winners of the “high-performance” category. So we include AEGIS-128 as a fourth point of comparison. Finally, on the 64-bit Intel architecture we include AES-GCM-SIV, as specified in RFC 8452 [13] (2018). The AES-GCM-SIV mechanism builds on Rogaway and Shrimpton’s SIV [41] to add nonce-reuse misuse-resistance, the most important security property, in our view, that is not delivered by OCB or the other alternatives we have named. AES-GCM-SIV is benchmarked only on the 64-bit Intel architecture because, at the time of writing, that is the only high-quality implementation available.

We ran experiments on three different architectures. The most important architectures today are 64-bit Intel x86 for servers, desktops and laptops, and 64-bit ARM for high-performance embedded applications such as smartphones and tablets. We therefore benchmark on CPUs fitting these descriptions. For 64-bit Intel we use a recent Intel Core i3-8121U (“Cannon Lake”). For 64-bit ARM we test on a high-performance Cortex-A73. As a catch-all for architectures not falling into either of these categories, we also benchmark on an older 32-bit ARMv5 processor (Marvell 88FR131 “Feroceon”). The Feroceon results capture how much work each algorithm needs when the unit of work is a 32-bit RISC assembly instruction and the processor lacks hardware acceleration such as an AES unit or vector registers. The Feroceon is also a reasonable stand-in for what can be expected from ARM’s current M-series embedded processors.

Source code. A problem sometimes seen in performance comparisons is to include implementations of uneven quality. To avoid this, we considered for benchmarking the best-performing version of each scheme among several sources. OpenSSL, Libgcrypt, and WolfSSL all provide open-source cryptographic libraries with significant assembly-language acceleration targeting multiple architectures. For each algorithm implemented by each library, we identified its peak speed on multiple architectures. In every case OpenSSL’s library was the fastest. This is not surprising: OpenSSL has a longer record of producing good-quality assembly than the others. As a sanity check, whenever possible we also verified that speeds reported on the SUPERCOP benchmarking website [42] were not faster that the speeds we found.

The AEGIS-128 implementation we used for benchmarking is a modified version of the optimized one placed on the SUPERCOP benchmarking website by the AEGIS designers. The AES-GCM-SIV implementation we used is from Google’s BoringSSL library. We ourselves have supplied the OCB code. The AEGIS-128 and OCB codes were written in C (with AES intrinsics when available) and all other AE algorithms have their kernels written in assembly. All non-library code was compiled with both Clang 8.0 and GCC 8.3 or 9.1, using both optimization levels 2 and 3, and whichever combination performed best is reported. The OpenSSL and BoringSSL libraries were built using the library’s build script with the addition of architecture-specific flags. We found that changing which compiler or optimization level was used to compile the libraries made no measurable difference in our tests, no doubt because the time-critical code was already in assembly, thus unaffected by compiler options.

Discussion. Having AES in hardware is the most significant determinant of performance. On our 64-bit ARM and 64-bit Intel processors—both of which have hardware AES support as well as support for GCM’s “carryless” multiplication—OCB, AEGIS, and GCM are fastest. On these architectures, OCB and AEGIS have similar performance profiles, while GCM encrypts 2KB messages 40–60% slower. Chacha20-Poly1305 and CCM have peak speeds 3–7 times slower than OCB and AEGIS-128. On systems without AES in hardware, Chacha20-Poly1305 is the clear winner. On our Feroceon system Chacha20-Poly1305 encrypts 2KB messages 18% faster than AEGIS-128, twice as fast as OCB, and over three times faster than CCM and GCM.

The relative speeds of these algorithms are easy to explain. OCB and AEGIS-128 were designed to have little marginal overhead beyond calls to the AES round function. On systems with a hardware AES unit, this results in outstanding peak performance. Although CCM and GCM can also take advantage of a hardware AES unit, the computation of GHASH for GCM and the CBC-MAC for CCM, in addition to the counter-mode encryption each performs, is enough to separate them from the speed leaders. Chacha20-Poly1305 does not use AES or carryless multiplication, instead depending on integer addition, rotation, and xor; it gets no benefit from AES hardware support. Chacha20-Poly1305 is designed to benefit from vectorized instructions, like AVX on Intel and NEON on ARM, but these do not have enough accelerative power to keep up with the AES unit’s capabilities.

On a system without an AES unit, we see that AEGIS-128 is about twice as fast as OCB (because it computes about half as many AES rounds, per byte of input, as OCB), and OCB is about twice as fast as CCM (because it computes about half as many AES rounds as CCM). GCM is somewhere between OCB and CCM because GCM substitutes a less expensive GHASH computation for CCM’s CBC-MAC. The simplicity of the Chacha20-Poly1305 design leads to its superior performance on simpler systems.

Effect of parallelism. On systems with an AES unit in hardware, parallelism has a profound effect. OCB can take advantage of any amount of AES parallelism. If a system is able to compute, say, 32 AES calls in parallel, then OCB computation can be arranged to loop and process 32 blocks per iteration. This means that OCB’s performance depends primarily on how many AES computations can be done in parallel and how long each takes to complete. On Intel’s Cannon Lake architecture, its AES unit can issue two AES-rounds per cycle and they take five cycles to complete. This means that if the pipeline can be kept full, it can complete ten rounds of processing every five cycles. Since each 16 bytes of plaintext requires at least ten rounds of AES processing, this implies a maximum throughput of five cycles per 16 bytes, or 0.31 cpb. On Intel’s Skylake architecture, where only one AES round can be issued per cycle, it will take at least ten cycles per 16 bytes, so 0.62 cpb. OCB achieves this lower-bound speed on Skylake but not on Cannon Lake. There is enough slack in the functional units during Skylake’s computation to absorb the computational overhead beyond AES computations, whereas on Cannon Lake there is not.

Looking toward newer architectures, Intel is currently rolling out its AVX-512 architecture. Our Cannon Lake testbed has some previously developed 512-bit vector instructions, which are used by Chacha20-Poly1305, but Intel’s Ice Lake architecture brings more significant improvements. Its AES unit doubles throughput and shaves 25% off latency which will benefit OCB much more than the other AE algorithms. AEGIS-128 has a serial component—it cannot start the first AES round of an input block until the first AES round of the prior block has completed—implying that its performance is limited by AES round latency. This should result in a 25% speed increase between Cannon Lake and Ice Lake architectures, but that is less than the potential doubling of OCB speed. GCM will see a doubling in AES throughput, but carryless multiplication will maintain its throughput, limiting any performance increase. CCM’s serial CBC-MAC computation will be 25% faster on Ice Lake than on Cannon Lake, but will continue to keep CCM’s performance slowest of the group. Chacha20-Poly1305’s AVX-512 speed improvements are already reflected in this study and should see no significant improvement on Ice Lake. Preliminary results on an Ice Lake system, completed as we go to press, show an OCB peak throughput of 0.19 cpb, which is very close to the predicted doubling.

Testing methodology. The number of CPU cycles needed to encrypt a message is divided by the length of the message to arrive at the cost per byte to encrypt messages of that length. This is done for every message length from 1 to 2048 bytes. So as not to have performance results overly influenced by the memory subsystem of a host computer, we arrange for all code and data to be in level-1 cache before timing begins. Two timing strategies are used: the clock function of C and the x86 time-stamp counter. In the clock version, the ANSI C clock() function is called before and after repeatedly encrypting the same message, on sequential nonces, for a little more than one second. The clock difference determines how many CPU cycles were spent on average per processed byte. This method is highly portable, but is time-consuming when collecting an entire dataset. On x86 machines there is a “time-stamp counter” (TSC) that increments once per CPU cycle. To capture the average cost of encryption, the TSC is used to time encryption of the same message 64 times on successive counter-based nonces. The TSC method is not portable, working only on x86, but it is fast. The TSC read instruction might be executed out of order, in some cases it has high latency, and it continues counting when other processes run. To reduce the impact of these problems, we read the TSC once before and after encrypting the same message 65 times and then read the TSC once before and after encrypting the same message once more. Subtracting the second timing from the first gives us the cost for encrypting the message 64 times, and mitigates the out-of-order and latency problems. To avoid including context-switches, we run experiments multiple times and keep only the median timing.

7 Security

The key steps for proving OCB secure are establishing the security of the $\Theta $CB construction when based on an idealized TBC, and then showing that our way of creating a tweakable blockcipher, the $\text {Tw} $-construction, works. Both steps are in the information-theoretic setting. Passing to the complexity-theoretic setting for the final result is then standard.

7.1 Security of the TBC-Based Scheme

The following lemma establishes the priv-security and auth-security of $\Theta $CB when based on an ideal TBC.

Lemma 2

${\mathbf \mathbf{Adv}}^{\mathrm {priv}}_{\Pi }(\mathscr {A}_{\mathrm {priv}}) =0$ for any adversary $\mathscr {A}_{\mathrm {priv}}$, while
$ {\mathbf \mathbf{Adv}}^{\mathrm {auth}}_{\Pi }(\mathscr {A}_{\mathrm {auth}}) \le 2^{n-\tau }/(2^n-1) \le 1.1/ 2^\tau $ for any adversary $\mathscr {A}_{\mathrm {auth}}$.

Note that in this idealized setting, privacy and authenticity do not degrade with message lengths, AD lengths, or the number of queries: privacy is perfect, while the chance of forging is about $1/2^\tau $.

Proof

Consulting Fig. 4 will make the proof more understandable. Keep in mind that, in the idealized setting we consider, ${\smash {\widetilde{E}}}_K^{N{\,}i}$, ${\smash {\widetilde{E}}}_K^{N{\,}i{\,}\$}$, ${\smash {\widetilde{E}}}_K^{N{\,}i{\,}*}$, ${\smash {\widetilde{E}}}_K^{N{\,}i{\,}*{\,}\$}$, ${\smash {\widetilde{E}}}_K^{{\,}i}$ and ${\smash {\widetilde{E}}}_K^{{\,}i{\,}*}$ are random permutations on n bits (for all permissible N and i). To emphasize this, we write $\pi $ in place of ${\smash {\widetilde{E}}}_K$, so $\pi ^{N{\,}i}$, $\pi ^{N{\,}i{\,}*}$, $\pi ^{N{\,}i{\,}\$}$, $\pi ^{N{\,}i{\,}*{\,}\$}$, $\pi ^{{\,}i}$ and $\pi ^{{\,}i{\,}*}$. Any two such permutations are independent of one another as long as their superscripts are distinct. We must argue privacy and authenticity.

Privacy. During the adversary’s attack it asks queries $(N^1,A^1,M^1),\ldots ,(N^q,A^q,M^q)$ and gets responses ${\mathscr {C}}^1,\ldots , {\mathscr {C}}^q$ where ${\mathscr {C}}^j=C^j\,\Vert \,T^j$ and $C^j = C^j_1\cdots C^j_{m^j}$ or $C^j = C^j_1\cdots C^j_{m^j}C^j_*$. In general, in this proof we will use a superscript of j on a variables to indicate the value of that variable following the adversary’s jth query. We may do this with any variable appearing in Fig. 4.

Since the $N^j$-values must be distinct, each permutation of the form $\pi ^{N^j\cdots }$ is used at most once. We are thus applying independent random permutations to a single point each, so all of these outputs are uniformly random and independent. The bulk of what is returned to the adversary, all the $C^j_i$ values, are such $\pi ^{N^j{\,}i}$ outputs. But the $C_*^j$ and $T^j$ values are not just permutation outputs but are, instead, $\pi ^{N^j{\,}m^j{\,}*}$ or $\pi ^{N^j{\,}m^j{\,}\$}$ or $\pi ^{N^j{\,}m^j{\,}*{\,}\$}$ outputs that are xored with $M^j_*\,0^*$ and truncated to $|M^j_*|$ bits, or, alternatively, are xored with $\mathrm {Auth}^j$ and then truncated to $\tau $ bits. Either way, the result remains uniform and independent of all other outputs, as $\mathrm {Auth}^j$, $M^j_*$, and $\tau $ are independent of $\pi ^{N^j\cdots }$ outputs. We conclude that each result from the adversary’s $j^\mathrm {th}$ query is a uniformly random string ${\mathscr {C}}^j$ of length $|M^j|+\tau $ that is independent of all other query responses. It follows that the adversary’s privacy advantage is zero.

Authenticity. Before we launch into proving authenticity, consider the following simple game, which we call game G. Suppose that you know that an n-bit string $\mathrm {Tag}$ is not some particular value $\mathrm {Tag}_0$. All of the $2^n-1$ other possibilities are equally likely. Then, your chance of correctly predicting the $\tau $-bit prefix of $\mathrm {Tag}$ is at most $2^{n-\tau }/ (2^n-1)$. That is because the best strategy is to guess any $\tau $-bit string other than the $\tau $-bit prefix of $\mathrm {Tag}_0$. The probability of being right under this strategy is $2^{n-\tau }/ (2^n-1)$. We will repeatedly use this fact in the sequel.

Now suppose that the adversary asks a sequence of queries $(N^1,A^1,M^1),\ldots , (N^q,A^q,M^q)$ and then makes its forgery attempt $(N,A,{\mathscr {C}})$, where ${\mathscr {C}}= C\,\Vert \,T$, $C=C_1\cdots C_c$ or $C=C_1\cdots C_c C_*$. We need to bound the probability that $(N,A,{\mathscr {C}})$ is a successful forgery.

We begin by considering the case where the forgery attempt $(N,A,{\mathscr {C}})$ employs a nonce N that is not among $\{N^1,\ldots ,N^q\}$. Then to forge the adversary needs to find the correct value of $T=(\pi ^{N\cdots }({\mathrm {Checksum}})\oplus \mathrm {Auth})[1..\tau ]$ but has seen nothing that depends on $\pi ^{N\cdots }$. Having no relevant information to use, the adversary’s chance of success is clearly $2^{-\tau }$, which is less than $2^{n-\tau }/(2^n-1)$.

Given the above, we can assume that the forgery attempt $(N,A,{\mathscr {C}})$ uses a nonce $N=N^j$ that coincides with a nonce from some prior query. All the other queries had to employ different nonces and thereby generated information theoretically unrelated to the adversary’s task at hand. We can therefore disregard these other encryption queries and assess the maximal forging probability for a new and simpler game: the adversary asks a single encryption query $(N,A',M')$, getting a response ${\mathscr {C}}'=C'\,\Vert \,T'$, and then it tries to forge, with the same nonce, outputting a triple $(N,A,{\mathscr {C}})$. We must bound the probability that such a forgery is valid. We write $A'=A'_1\cdots A'_{a'}$ or $A'=A'_1\cdots A'_{a'} \,A'_*$, $M'=M'_1\cdots M'_{m'}$ or $M'=M'_1\cdots M'_{m'}\, M'_*$, $A =A_1\cdots A_{a}$ or $A=A_1\cdots A_{a} \,A_*$, ${\mathscr {C}}= C\,\Vert \,T$, $C = C_1\cdots C_c$ or $C = C_1\cdots C_c\, C_*$. We proceed by case analysis, relating the form of the forgery attempt $(N,A,{\mathscr {C}})$ to the encryption query $(N,A',M')$. We say that a string is full if it has a multiple of n bits, and partial otherwise.

Case 1: $A\ne A'$, $A\ne \varepsilon \ne A'$. We have several subcases to consider. Case 1a. If $A'$ is full and A is partial, then $\mathrm {Auth}'\,\Vert \,\mathrm {Auth}$ will be $U_{2n}$ distributed, the uniform distribution on 2n bits, and the adversary will only be able to forge with probability $2^{-\tau }$. The uniformity of $\mathrm {Auth}'\,\Vert \,\mathrm {Auth}$ is because $\mathrm {Auth}'$ will depend on a $\pi ^a$ output but no $\pi ^{a{\,}*}$ output, while $\mathrm {Auth}$ will depend on a $\pi ^{a{\,}*}$ output. The case when $A'$ is partial and $A'$ is full is similar. Case 1b. If $A'$ and A are full and $a\ne a'$, then $\mathrm {Auth}'\,\Vert \,\mathrm {Auth}$ will again be $U_{2n}$ distributed, whence the forging probability will be $2^{-\tau }$. The uniformity of $\mathrm {Auth}' \,\Vert \,\mathrm {Auth}$ is because each includes a $\pi ^1$ output in the xor but only one xors in a $\pi ^{\max (a,a')}$ output. The case of $A'$ and A being partial but $a\ne a'$ is analogous. Case 1c. Suppose $A'=A'_1\ldots A'_a$ and $A =A_1 \ldots A_a$ are full, equal in length, but distinct; say $A'_j\ne A_j$ for some j. Then, even if the adversary were given not only ${\mathscr {C}}'$ but also $\mathrm {Final}$, all $\pi ^i(A'_i)$ values, and all $\pi ^i(A_i)$ values except for $i=j$, still the value of $\mathrm {Auth}= \oplus _i\, \pi ^i(A_i) = Z \oplus \pi ^j(A_j)$, and therefore $\mathrm {Tag}$, would be uniform in a space of $2^n-1$ possible values. We are now in the setting of game G, and the adversary’s chance to predict T, the first $\tau $ bits of $\mathrm {Tag}$, is at most $2^{n-\tau }/(2^n-1)$. Case 1d. The last case is when $A'=A'_1\ldots A'_a A'_*$ and $A =A_1 \ldots A_a A_*$ are partial, equal in length, but distinct. If $A'_j\ne A_j$ for some j, then proceed as with Case 1c. If $A'_*\ne A_*$, then $A'_*10^*\ne A_*10^*$ and proceed as with Case 1c, but with $\pi ^{a{\,}*}(A_*10^*)$ the unknown value.

Case 2: $A\ne A'$, $A=\varepsilon $ or $A'=\varepsilon $. If $A'=\varepsilon \ne A$, then $\mathrm {Auth}'=0^n$, while $\mathrm {Auth}$ is $U_n$ distributed and independent of $\mathrm {Final}$, so $\mathrm {Tag}$ is uniform and independent of all values the adversary has observed, whence its ability to predict its $\tau $-bit prefix is at most $2^{-\tau }$. If $A'\ne \varepsilon =A$ then $\mathrm {Auth}=0^n$ and the adversary must output the $\tau $-bit prefix of $\mathrm {Final}$. It has received no information on any $\pi ^{N{\,}i {\,}\$}$ or $\pi ^{N{\,}i {\,}* {\,}\$}$ from its encryption query, as the only such output was masked with $\mathrm {Auth}'$, formed using a $\pi ^{a'}$ or $\pi ^{a'{\,}*}$ output. So the forging probability is again $2^{-\tau }$.

Case 3: $A=A'$; $M'$ partial and C full, or the other way around. For this and all remaining cases, we imagine providing the adversary with $\mathrm {Auth}'=\mathrm {Auth}$. So the adversary must predict the first $\tau $ bits of $\mathrm {Final}$. But $\mathrm {Final}$ is the output of a random permutation $\pi ^{N{\,}c {\,}\$}$ that has not yet been applied to any point. With no relevant information for carrying out this task, its chance to forge is at most $2^{-\tau }$. For the case with $M'$ full and C partial, $\mathrm {Final}$ is, similarly, an output $\pi ^{N{\,}c {\,}*{\,}\$}$ where this permutation has not yet been applied to any point.

Case 4: $A=A'$; $c\ne m'$; $M'$ and C both full or both partial. the adversary must predict the first $\tau $ bits of $\mathrm {Final}$. But $\mathrm {Final}$ is the output of a random permutation $\pi ^{N{\,}c {\,}\$}$ (if $M'$ and C both full) that has not yet been used at even one point. With no relevant information, the adversary forges with probability at most $2^{-\tau }$. Similarly when $M'$ and C are both partial.

Case 5: $A=A'$, $c=m'$, $M'$ and C full. For the forgery to be valid we must have $C'\ne C$ so let j to be the smallest value where $C'_j\ne C_j$. Assume the adversary is given the value of $\mathrm {Auth}'=\mathrm {Auth}$, all $\pi ^{N{\,}i}$ permutations where $i\ne j$, and all $\pi ^{\cdots \$}$ permutations. Then the adversary will know ${\mathrm {Checksum}}'$ and all the values but one that xored together make ${\mathrm {Checksum}}$. But it will be missing one addend in the xor, and all $2^n-1$ values other than $M_j$ are equally likely for it, whence ${\mathrm {Checksum}}$ can assume $2^n-1$ values, each equiprobable, and $\mathrm {Final}$ can assume $2^n-1$ values, each equiprobable. We are in the setting of game G and the adversary’s chance to forge is at most $2^{n-\tau }/(2^n-1)$.

Case 6: $A=A'$, $c=m$, $M'$ and C partial. Because $C\ne C'$ either there is a smallest j where $C'_j\ne C_j$ or else $C'_1\cdots C'_c=C_1\cdots C_c$ but $C'_*\ne C_*$. For the former case, proceed like Case 5. For the latter case the adversary may be able to compute ${\mathrm {Checksum}}' = M'_1\oplus \cdots M'_c \oplus M'_*10^*$ but also ${\mathrm {Checksum}}= M'_1\oplus \cdots M'_c \oplus M_*10^*$. These two values are necessarily distinct, due to the $10^*$ padding. Even if the adversary is given both and knows $\mathrm {Final}'$ the image of ${\mathrm {Checksum}}$ under

https://static-content.springer.com/image/art%3A10.1007%2Fs00145-021-09399-8/MediaObjects/145_2021_9399_IEq514_HTML.gif

will be a uniformly random among the $2^n-1$ points that are not $\mathrm {Final}'$. We are again left in the situation of game G, and the adversary’s probability to forge is at most $2^{n-\tau }/(2^n-1)$. This completes the proof.

$\square $

7.2 Stretch-then-Shift Hash

OCB employs a new hash function, “Init” in Fig. 3, to map the nonce N to the initial offset $\Delta $. We aimed to construct a hash that would need at most one AES computation no matter what N-values are presented, but would usually reuse a previously computed AES output, plus a couple of shifts and xors, if N is one more than it was last time around. Our method applies AES to the top 122 bits of N to create an initial value Ktop, but then modifies Ktop in a simple way using the lower six bits of N, call them Bottom. Specifically, we return the first 128 bits of $H_\mathrm {Ktop}(\mathrm {Bottom}) = (\mathrm {Ktop}\,\Vert \,(\mathrm {Ktop}\oplus (\mathrm {Ktop}\ll 8))) \ll \mathrm {Bottom}$. There is no deep intuition motivating the formula; it is just the simplest method we found that we could prove to work. We start with the needed definitions.

Definition. Let $\mathcal {K}$ be a finite set and let $H{:\;}\mathcal {K}\times \mathcal{X}\rightarrow {\{0,1\}}^n$ be a function. We say that H is strongly xor-universal if for all distinct $x,x'\in \mathcal{X}$ we have that $H_K(x) \oplus H_K(x')$ is uniformly distributed in ${\{0,1\}}^{n}$ and, also, $H_K(x)$ is uniformly distributed in ${\{0,1\}}^n$ for all $x\in \mathcal{X}$. The first requirement is the usual definition for H being xor-universal and the second we call universal-1.

The technique. We aim to construct strongly xor-universal hash-functions $H{:\;}\mathcal {K}\times \mathcal{X}\rightarrow {\{0,1\}}^n$ where $\mathcal {K}={\{0,1\}}^{128}$, $\mathcal{X}=[0{..}\mathrm {domSize}-1]$, and $n=128$. We want $\mathrm {domSize}$ to be at least some modest-size number, say $\mathrm {domSize}\ge 64$, and intend that computing $H_K(x)$ be extremely fast. Computation of H should not require large tables, preprocessing of K, or special hardware support.

The method we propose is to stretch the key K into a longer string $\textit{stretch}(K)$ and then extract its bits $x+1$ to $x+128$. Symbolically, $H_K(x)= (\textit{stretch}(K))[x+1\;{..}\; x+128]$ where $S[a{..}b]$ denotes bits a through b of S, indexing beginning with 1. Equivalently, $H_K(x)=(\textit{stretch}(K){\ll }x)[1{..}128]$. We call this a stretch-then-shift hash.

How to stretch K? It seems natural to have $\textit{stretch}(K)$ begin with K, so let us assume that $\textit{stretch}(K) = K\;\Vert \;s(K)$ for some function s. It is easy to see that $s(K)\!=\! K$ and $s(K){\ll }c$ will not work, but $s(K) = K \oplus (K {\ll }c)$, for some constant c, looks plausible. We now demonstrate that, for well-chosen c, this function does the job.

Analysis. To review, we are considering the hash functions $H^c_K(x) = (\mathrm {Stretch}{\ll }x)[1{..}128]$ where $\mathrm {Stretch}=\textit{stretch}(K)=K \;\Vert \;(K\oplus (K{\ll }c))$ and $c\in [0{..}127]$. We would like to know the maximal value of $\mathrm {domSize}$ for which $H_K(x)$ is xor-universal on the domain $\mathcal{X}=[0{..}\mathrm {domSize}(c)\!-\!1]$. This can be calculated by a computer program, as we now explain. Fix c and consider the $256 \times 128$ entry matrix $A=\left( \!\!\begin{array}{c} I\\ J \end{array}\!\! \right) $ where I is the $128\times 128$ identity matrix and J is the $128\times 128$-bit matrix for which $J_{ij}=1$ iff $j=i$ or $j= i+c$. Let $A_i$ denote the $128\times 128$ submatrix of A that includes only A’s rows i to $i+127$. Then $H^c_K(x) = A_{x+1} K$, the product in ${{\smash {{\mathrm {GF}}(2)}}}$ of the matrix $A_{i+1}$ and the column vector K. Let $B_{i,j} = A_i + A_j$ be the indicated $128\times 128$ matrix, the matrix sum over ${{\smash {{\mathrm {GF}}(2)}}}$. We would like to ensure that, for arbitrary $0\le i<j<\mathrm {domSize}(c)$ and a uniform $K\in {\{0,1\}}^{128}$ that the 128-bit string $H^c_K(i) + H^c_K(j)$ is uniform—which is to say that $A_{i+1} K + A_{j+1} K = (A_{i+1} + A_{j+1}) K = B_{i+1,j+1}K $ is uniform. This will be true if and only if $B_{i,j}$ is invertible in ${{\smash {{\mathrm {GF}}(2)}}}$ for all $1\le i<j\le \mathrm {domSize}(c)$. Thus $\mathrm {domSize}(c)$ can be computed as the largest number $\mathrm {domSize}(j)$ such that $B_{i,j}$ is full rank, over ${{\smash {{\mathrm {GF}}(2)}}}$, for all $1\le i<j\le \mathrm {domSize}(j)$. Recalling the universal-1 property we also demand that $A_{i}$ have full rank for all $1\le i\le \mathrm {domSize}(c)$. Now for any c, the number of matrices $A_{i,j}$ to consider is at most $2^{13}$, and finding the rank in ${{\smash {{\mathrm {GF}}(2)}}}$ of that many $128\times 128$ matrices is not a difficult calculation.

Our results are tabulated in Fig. 9. The most interesting cases are $H^5$ and $H^8$, which are strongly xor-universal on $\mathcal{X}=[0{..}123]$ and $\mathcal{X}=[0{..}84]$, respectively. We offer no explanation for why these functions do well and various other $H^c$ do not. As both $H^5$ and $H^8$ work on $[0{..}63]$ we select the latter map for use in OCB and single out the following result:

Lemma 3

Let $H{:\;}{\{0,1\}}^{128}\times [0{..}63]\rightarrow {\{0,1\}}^{128}$ be defined by $H_K(x) = (\mathrm {Stretch}{\ll }x)[1{..}128]$ where $\mathrm {Stretch}= K \;\Vert \;(K\oplus (K{\ll }8))$. Then, H is strongly xor-universal.

Efficiency. On 64-bit computers, assuming $K \;\Vert \;(K\oplus (K{\ll }8))$ is precomputed and in memory, the value of $H_K(x)$ can be computed by three memory loads and two multiprecision shifts, requiring fewer than ten cycles on most architectures. If only K is in memory then the first 64 bits of $K\oplus (K{\ll }8)$ can be computed with three additional assembly instructions.

7.3 Instantiating the TBC

We show that ${\smash {\widetilde{E}}}\!=\!\text {Tw} [E,\tau ]$ is a good TBC if E is a good blockcipher. In formalizing this, bidirectional tweaks are those of the form (N, i). Our result is again, for now, in the information-theoretic setting.

Lemma 4

Fix $n=128$ and $\tau \in [0..n]$ and $E=\mathbb {E}^{n}$. Let ${\smash {\widetilde{E}}}= \text {Tw} [E,\tau ]$ and $\mathcal{B}= \mathcal{N}\times {\mathbb {N}_1}$. Let $\mathscr {A}$ be an adversary that asks at most q queries, all with i-value less than $2^{120}$. Then ${\mathbf \mathbf{Adv}}^{{{\mathcal{B}}\text{- }\mathrm {prp}}}_{{\widetilde{E}}}(\mathcal{A}) \le 6q^2/2^n$.

Proof

We generalize the adversary’s capabilities in attacking $\text {Tw} [E,\tau ]$; see Fig. 10 for the construction we will call $\text {TW} $. There we write $\pi $ instead ${{\widetilde{E}}}_K$. The adversary, which we still refer to as $\mathscr {A}$, may now ask queries we will refer to as being of TYPE-1, TYPE-2, TYPE-3a, TYPE-3b. In other words, the adversary’s queries may take any of the forms $(1, W,\lambda )$, $(2, W,\mathrm {Top},\mathrm {Bottom},\lambda )$, $(\text{3a },W,\mathrm {Top},\mathrm {Bottom},\lambda )$, or $(\text{3b },Z,\mathrm {Top},\mathrm {Bottom},\lambda )$. We insist that the adversary not ask a query with $\mathrm {Top}\!=\!0$ (we stop to distinguish field points and the corresponding strings) and we demand that any $\lambda \in {{\smash {{\mathrm {GF}}(2^n)}}}$ asked in a query is used only for queries of one numeric TYPE (it is fine to use the same $\lambda $ in queries of TYPE-3a and 3b). The adversary may not repeat queries nor ask a query with a trivially known answer (a TYPE-3b query following the corresponding TYPE-3a query, or the other way around). Working in ${{\smash {{\mathrm {GF}}(2^n)}}}$, we sometimes write xor as addition.

As the adversary asks its queries the mechanism makes what we will call internal queries to the random-permutation $\pi $. For example, the adversary’s TYPE-1 query of $(W,\lambda )$ results in an internal type-1 query of X. The internal queries come in two flavors, direct and indirect, as show in Fig. 5. Note that the total number of internal queries resulting from the adversary’s q queries is at most $\sigma \!=\!2q\!+\!1$. The hash function H that we use to compute $\mathrm {Initial}$ is the map defined and proven secure in Lemma 3. That said, any strongly xor-universal hash function with the needed domain and range will do. It is important to understand that all of the abilities present in a “real” adversary attacking $\text {Tw} [E,\tau ]$ are also represented in the abilities of an adversary attacking $\text {TW} [E,\tau ]$ we have now described.

We aim to show $\mathscr {A}$ will get small advantage in attacking $\text {TW} $. The proof involves a game-playing argument followed by a case analysis of some collision probabilities. We begin with a game 1 that perfectly simulates the $\text {TW} $-construction. As the adversary $\mathcal{A}$ asks queries the game grows the permutation $\pi $ in the usual way, preparing each input for $\pi $ or $\pi ^{-1}$ exactly as would $\text {TW} $. The responses to type-A and B queries are stored and looked up as needed. In game 2 we return, in response to each internal query $\pi $ or $\pi ^{-1}$, a freshly minted uniformly random point of ${{\smash {{\mathrm {GF}}(2^n)}}}$. Note that this results in values returned to the adversary that are, likewise, uniformly random. In game 3 we perfectly simulate an ideal tweakable blockcipher ${\widetilde{\pi }}$ (with the right domain and tweak space). By the “switching lemma” the advantage of $\mathcal{A}$ in distinguishing games 2 and 3 is at most $0.5\,q(q\!-\!1)/2^n$, so we must only bound the gap between games 1 and 2.

In game 1, consider answering each internal query by uniformly sampling from ${\{0,1\}}^n$ and, hopefully, returning that sample. If we have already used our speculatively generated return value set a flag $\textit{bad}$ and re-sample from the co-range (for $\pi $-queries) or co-domain (for $\pi ^{-1}$ queries). The above $\textit{bad}$-setting events occur with probability at most $0.5\,\sigma (\sigma \!-\!1)/2^n$.

When an internal query clashes with any prior commitment made then, to accurately play game 1, we must answer the query according to the prior commitment. Assume we do so, and then set $\textit{bad}$. Call these $\textit{bad}$-setting events collisions. We can write games 1 and 2 so as to be identical until $\textit{bad}$ is set, so we have only to bound the probability of collisions in game 2, the version where we uniformly sample for responding to internal queries. Note that game 2 maintains the invariant that values returned to the adversary are independent of the values L and $\mathrm {Initial}$ selected internally. Because of this, we can simplify the temporal aspect of the game and replace it by an alternative one in which the adversary chooses all TBC queries, and their responses, at the beginning. Then we make the indirect queries that determine L and $\mathrm {Initial}$, and determine if a collision has occurred. Excising adaptivity in such a manner has been illustrated in much prior work.

Any potential collision event—e.g., the $20^{\mathrm {th}}$ internal query colliding with the $6^{\text {th}}$—can be summarized by writing something like ${\mathsf {Coll}(\mathrm {3a}, \mathrm {1}) }$, interpreted as saying that first there was a type-3a internal query $(W,\mathrm {Top},\mathrm {Bottom},\lambda )$, which generated a $\pi $-input of X (its value to be determined) and an adversarially-known response Z, and then the adversary asked a type-1 query of $(W',\lambda ')$, which gave rise to a $\pi $-input of $X'$ (value to be determined), and an adversarially-known response of $Y'$. Now we make the underlying type-A and type-B queries and it so happens that $X\!=\!X'$. Such an event is unlikely since it implies that $W + \mathrm {Initial}+ \lambda L = W' + \lambda ' L$, and $\mathrm {Initial}$ is uniform and independent of all other values named in the formula: we select the type-B output $\mathrm {Ktop}$ of $\pi $ uniformly at random, and H is universal-1, making $H_\mathrm {Ktop}(\mathrm {Bottom}) $ uniform, too. The probability of the event happening, for a given pair of indirect queries, is at most $2^{-n}$. The same holds for each of the 36 possible collision types. To avoid tedious repetition, we provide a few examples. We continue to use the same convention as in the last example, priming variables for the second query.

$\Pr [{\mathsf {Coll}(\mathrm {A}, \mathrm {B}) }]=\Pr [{\mathsf {Coll}(\mathrm {B}, \mathrm {A}) }]=0$ since the adversary is not allowed to query with $\mathrm {Top}=0$.
$\Pr [{\mathsf {Coll}(\mathrm {3a}, \mathrm {3a}) }] = \Pr [W+\mathrm {Initial}+\lambda L = W' + \mathrm {Initial}' + \lambda ' L]$. Queries may not be repeated, so $(W,\mathrm {Top},\mathrm {Bottom},\lambda )\ne (W',\mathrm {Top}',\mathrm {Bottom},\lambda ')$. Suppose $\mathrm {Top}\ne \mathrm {Top}'$. Then $\mathrm {Ktop}$ and $\mathrm {Ktop}'$ are uniform and independent, making $\mathrm {Initial}$ and $\mathrm {Initial}'$ uniform and independent by the universal-1 property of H, so the probability in question is at most $2^{-n}$. Suppose $\mathrm {Top}=\mathrm {Top}'$ but $\mathrm {Bottom}\ne \mathrm {Bottom}'$. Then $\mathrm {Ktop}=\mathrm {Ktop}'$ is uniform and, by the xor-universality of H, variables $\mathrm {Initial}$ and $\mathrm {Initial}'$ are uniform and independent of each other and of every other variables appearing in the formula, making the probability in question at most $2^{-n}$. If $\mathrm {Top}=\mathrm {Top}'$ and $\mathrm {Bottom}=\mathrm {Bottom}'$ and $\lambda =\lambda '$ then we know that $W\ne W'$ and the probability of collision is 0. Finally, if $\mathrm {Top}=\mathrm {Top}'$ and $\mathrm {Bottom}=\mathrm {Bottom}'$ and $W=W'$ then we know that $\lambda \ne \lambda '$ and the probability we are considering collapses to $\Pr [\lambda L = \lambda ' L] = \Pr [cL=0]$ where $c=\lambda -\lambda '\ne 0$. Since L was chosen uniformly at random in the type-A query, only the choice of $L=0$ results in a collision, which happens with probability $2^{-n}$.
$\Pr [{\mathsf {Coll}(\mathrm {2}, \mathrm {3b}) }] = \Pr [Y = \mathrm {Initial}'+\lambda ' L]$. This is at most $2^{-n}$ as, for example, $\mathrm {Initial}'$ is uniform and independent of Y, $\lambda '$, and L.
$\Pr [{\mathsf {Coll}(\mathrm {3b}, \mathrm {2}) }] = \Pr [W +\mathrm {Initial}+\lambda L = W'+ \mathrm {Initial}' + \lambda ' L$. Since $\lambda $-values must be distinct between TYPE-2 and TYPE-3b the probability is $\Pr [cL = W + \mathrm {Initial}+ W'+ \mathrm {Initial}']$ for some $c\ne 0$. Since the RHS side is independent of L and L is uniform, the probability is at most $2^{-n}$.

Continuing in this way one finds that each type of collision occurs with probability at most $2^{-n}$, implying a probability for any collision of at most $0.5\,\sigma (\sigma \!-\!1)/2^n$. Summing with the addends of $0.5\,\sigma (\sigma \!-\!1)$ and $0.5\,q(q\!-\!1)$ and recalling that $\sigma \le 2q+1$ we conclude that the total adversarial advantage is at most $4.5q^2 + 1.5q \le 6q^2$, completing the proof. $\square $

7.4 Wrapping Up

The following concretizes a claim, by now folklore, from Rogaway and Shrimpton [41, Section 7]. It is one half of the equivalence between ae-security and priv+auth security: an AE scheme that is secure in the priv and auth senses is secure under the all-in-one definition, too. We include a proof, as we don’t know where else in the literature one appears.

Lemma 5

Let $\Pi $ be an AE scheme. Let $\mathscr {A}$ be an adversary that asks at most $q_{{\mathrm {d}}}$ decryption queries. Then there are adversaries $\mathscr {A}_{\mathrm {priv}}$ and $\mathscr {A}_{\mathrm {auth}}$ such that

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\mathrm {ae}}_\Pi (\mathscr {A})\le {\mathbf \mathbf{Adv}}^{\mathrm {priv}}_\Pi (\mathscr {A}_{\mathrm {priv}}) + q_{{\mathrm {d}}}\cdot {\mathbf \mathbf{Adv}}^{\mathrm {auth}}_\Pi (\mathscr {A}_{\mathrm {auth}})\;. \end{aligned}$$

Adversaries $\mathscr {A}_{\mathrm {priv}}$ and $\mathscr {A}_{\mathrm {auth}}$ are explicitly constructed from $\mathscr {A}$ in the proof of this lemma. They have time and query complexity similar to that of $\mathscr {A}$.

Proof

Adversary $\mathscr {A}_{\mathrm {priv}}$ is constructed as follows. It runs adversary $\mathscr {A}$. When $\mathscr {A}$ makes a left-oracle query of (N, A, M) adversary $\mathscr {A}_{\mathrm {priv}}$ asks its own oracle (N, A, M) and gets a response C, which it returns to $\mathscr {A}$. When $\mathscr {A}$ makes a right-oracle query of (N, A, C), it returns $\bot $ to $\mathscr {A}$. When $\mathscr {A}$ halts with an output of b, adversary $\mathscr {A}_{\mathrm {priv}}$ outputs the same bit b and halts.

Adversary $\mathscr {A}_{\mathrm {auth}}$ is constructed as follows. It selects a random $j\twoheadleftarrow [1..q_{{\mathrm {d}}}]$. It then runs adversary $\mathscr {A}$. When $\mathscr {A}$ makes a left-oracle query of (N, A, M) adversary $\mathscr {A}_{\mathrm {auth}}$ asks its own oracle (N, A, M) and gets a response C, which it returns to $\mathscr {A}$. When $\mathscr {A}$ makes its ith right-oracle query of (N, A, C), adversary $\mathscr {A}_{\mathrm {auth}}$ returns $\bot $ if $i\ne j$, while, if $i=j$, it outputs (N, A, C) (as its attempted forgery) and then halts.

Omitting oracle-argument placeholders $(\cdot ,\cdot ,\cdot )$ for improved readability, we have that

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\mathrm {ae}}_\Pi (\mathscr {A})= & {} \Pr [\mathscr {A}^{\mathcal {E}_K,\mathcal {D}_K}\Rightarrow 1]- \Pr [\mathscr {A}^{\$, \bot } \Rightarrow 1]\\= & {} (\Pr [\mathscr {A}^{\mathcal {E}_K, \mathcal {D}_K}\Rightarrow 1]- \Pr [\mathscr {A}^{\mathcal {E}_K, \bot }\Rightarrow 1])\\&+ (\Pr [\mathscr {A}^{\mathcal {E}_K,\bot }\Rightarrow 1]- \Pr [\mathscr {A}^{\$, \bot }\Rightarrow 1])\\\le & {} \Pr [\mathsf {SomeForge}] + \mathrm {Adv}_\Pi ^{\mathrm {priv}}(\mathscr {A}_{\mathrm {priv}}) \\\le & {} \sum _{i=1}^{q_{{\mathrm {d}}}}\Pr [\mathsf {FirstForge}_i] + \mathrm {Adv}_\Pi ^{\mathrm {priv}}(\mathscr {A}_{\mathrm {priv}}) \\\le & {} q_{{\mathrm {d}}}\cdot \Pr [\mathsf {FirstForge}] + \mathrm {Adv}_\Pi ^{\mathrm {priv}}(\mathscr {A}_{\mathrm {priv}}) \end{aligned}$$

where $\mathsf {SomeForge}$ is the event that adversary $\mathscr {A}$, in the course of its execution with oracles $\mathcal {E}_K,\mathcal {D}_K$, asks its decryption oracle some query that results in a non-$\bot $ return value; where $\mathsf {FirstForge}_i$ is the event that $\mathscr {A}$, in the course of its execution with oracles $\mathcal {E}_K,\mathcal {D}_K$, gets its first non-$\bot $ oracle response on query-i; and where $\mathsf {FirstForge}$ is the event that $\mathscr {A}$, in the course of its execution with oracles $\mathcal {E}_K,\mathcal {D}_K$, gets its first non-$\bot $ return value in response to its jth query, for j uniform in $[1..q_{{\mathrm {d}}}]$. Our construction of $\mathscr {A}_{\mathrm {auth}}$ ensures that a decryption query by $\mathscr {A}$ that gets a first non-$\bot $ response always corresponds to a successful forgery by the adversary $\mathscr {A}_{\mathrm {auth}}$ we have constructed; that is, $ {\mathbf \mathbf{Adv}}^{\mathrm {auth}}_\Pi (\mathscr {A}_{\mathrm {auth}}) = \Pr [\mathscr {A}_{\mathrm {auth}}^{\mathcal {E}_K} \text{ forges}] \ge \Pr [\mathsf {FirstForge}]$. This completes the proof. $\square $

We now have all the needed elements to conclude the security of OCB. First, combining Lemmas 2 and 5 immediately establishes the ae-security of ${\Theta \text {{CB}} } $ based on an idealized TBC.

Lemma 6

Fix $n=128$ and $\tau \in [0..n]$ and ${\smash {\widetilde{\mathbb {E}}}}={\smash {\widetilde{\mathbb {E}}}}^{\mathcal{T}_{\Theta \text {{CB}} },\;n}$. Let $\Pi ={\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ]$. Let $\mathscr {A}$ be an adversary that asks at most $q_{{\mathrm {d}}}$ decryption queries. Then ${\mathbf \mathbf{Adv}}^{\mathrm {ae}}_{\Pi }(\mathscr {A}) \le 1.001\,q_{{\mathrm {d}}}/2^\tau $.

Next we replace the idealized TBC of ${\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ]$ by the TBC one gets from the Tw construction applied to an idealized (but not tweakable) blockcipher. Combining Lemmas 4 and 6 then gives the following. In the theorem statement, the number of blocks asked by an adversary refers to the total number of n-bit plaintext or ciphertext blocks, any fractional final blocks counting as one block, plus the number of associated-data blocks, any fractional final blocks again counting as one block.

Lemma 7

Fix $n=128$ and $\tau \in [0..n]$ and $\mathbb {E}=\mathbb {E}^n$. Let $\Pi ={\text {OCB} }[\mathbb {E},\tau ]$. Let $\mathscr {A}$ be an adversary that asks at most q queries, at most $q_{{\mathrm {d}}}$ of them decryption queries, and at most $\sigma $ blocks. Then ${\mathbf \mathbf{Adv}}^{\mathrm {ae}}_{\Pi }(\mathscr {A}) \le 6(\sigma +2q) ^2/2^n + 1.1\,q_{{\mathrm {d}}}/2^\tau $.

Proof

We construct an adversary $\mathscr {D}$ as follows. It has an oracle for a TBC taking tweaks $T\in \mathcal{T}_{\Theta \text {{CB}} } $ and inputs $X\in {\{0,1\}}^n$; and another oracle for the inverse of this TBC, taking tweaks $T\in \mathcal{T}_{\Theta \text {{CB}} } $ and inputs $Y\in {\{0,1\}}^n$. Adversary $\mathscr {D}$ runs adversary $\mathscr {A}$. When $\mathscr {A}$ makes an encryption query of (N, A, M) adversary $\mathscr {D}$ services this query by encrypting M with respect to N and A using the ${\Theta \text {{CB}} } $ algorithm but employing its first oracle in lieu of all forward-direction blockcipher calls. When $\mathscr {A}$ makes a decryption query of (N, A, C) adversary $\mathscr {D}$ services it by decrypting C with respect to N and A using the ${\Theta \text {{CB}} } $ algorithm but employing its first oracle for forward-direction blockcipher calls and its second oracle for inverse-direction blockcipher calls. Note that all calls made by $\mathscr {D}$ use tweaks from $\mathcal{T}_{\Theta \text {{CB}} } $ and that inverse calls will only use tweaks of the form (N, i). We may assume that all calls employ i-values of less than $2^{64}$, for, otherwise, $\sigma \ge 2^{64}$ and the result is vacuous. The total number of queries asked by $\mathscr {D}$ will be at most $\sigma +2q$, the 2q accounting for the fact that there is one TBC call for the Checksum and one TBC call for the processing of the nonce to produce the initial offset. Now

$$\begin{aligned} {\mathbf \mathbf{Adv}}_\Pi ^{\mathrm {ae}}(\mathscr {A})= & {} \Pr [\mathscr {A}^{{\text {OCB} }[\mathbb {E},\tau ].\mathcal {E},\,{\text {OCB} }[\mathbb {E},\tau ].\mathcal {D}}\Rightarrow 1] - \Pr [\mathscr {A}^{\$,\,\bot }\Rightarrow 1]\\= & {} (\Pr [\mathscr {A}^{{\text {OCB} }[\mathbb {E},\tau ].\mathcal {E},\,{\text {OCB} }[\mathbb {E},\tau ].\mathcal {D}}\Rightarrow 1] - \Pr [\mathscr {A}^{{\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ].\mathcal {E},\,{\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ].\mathcal {D}}\Rightarrow 1]) +\\&(\Pr [\mathscr {A}^{{\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ].\mathcal {E},\,{\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ].\mathcal {D}}\Rightarrow 1]) - \Pr [\mathscr {A}^{\$,\,\bot }\Rightarrow 1])\\= & {} {\mathbf \mathbf{Adv}}^{{{\mathcal{B}}\text{- }\mathrm {prp}}}_{\text {Tw} [\mathbb {E},\tau ]}(\mathscr {D}) + {\mathbf \mathbf{Adv}}^{\mathrm {ae}}_{{\Theta \text {{CB}} } [{\smash {\widetilde{\mathbb {E}}}},\tau ]}(\mathscr {A})\\\le & {} 6(\sigma +2q)^2/2^n + 1.1\,q_{{\mathrm {d}}}/2^\tau . \end{aligned}$$

The first term is by Lemma 4 and the observations made previously concerning adversary $\mathscr {D}$’s attack. The second term is by Lemma 6. $\square $

Finally we can quantify the security of OCB when instantiated with a “real” blockcipher E.

Theorem 1

Fix $n=128$ and $\tau \in [0 {..}n]$ and a blockcipher $E\!:\mathcal {K}\times {\{0,1\}}^n\rightarrow {\{0,1\}}^n$. Let $\Pi = \text {OCB} [E,\tau ]$. Let $\mathscr {A}$ be an adversary that asks at most q queries, at most $q_{{\mathrm {d}}}$ of them decryption queries, and at most $\sigma $ blocks. Then there is an adversary $\mathscr {A}_{\mathrm {prp}}$, explicitly constructed by the proof of this theorem, for which

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\smash {\pm \mathrm {prp}}}_E(\mathscr {A}_{\mathrm {prp}})\ge & {} {\mathbf \mathbf{Adv}}^{\mathrm {ae}}_\Pi (\mathscr {A}) - 6(\sigma +2q)^2/2^n - 1.1\,q_{{\mathrm {d}}}/2^\tau \;. \end{aligned}$$

Its time complexity is similar to that of $\mathscr {A}$, and it asks at most $\sigma +2q$ queries.

Proof

Adversary $\mathscr {A}_{\mathrm {prp}}$ is constructed as follows. It runs $\mathscr {A}$. When $\mathscr {A}$ asks an encryption query of (N, A, M) adversary $\mathscr {A}_{\mathrm {prp}}$ services this query by using the OCB construction with tag length $\tau $, but employing its own left oracle in lieu of each forward blockcipher computation. When $\mathscr {A}$ asks a decryption query of (N, A, C) adversary $\mathscr {A}_{\mathrm {prp}}$ services this query by using the OCB construction with tag length $\tau $, but employing its own left oracle in lieu of each forward blockcipher computation and employing its right oracle in lieu of each inverse blockcipher computation. Now

$$\begin{aligned} {\mathbf \mathbf{Adv}}^{\mathrm {ae}}_\Pi (\mathscr {A})= & {} \Pr [\mathscr {A}^{{\text {OCB} }[E,\tau ].\mathcal {E},\,{\text {OCB} }[E,\tau ].\mathcal {D}} \Rightarrow 1] - \Pr [\mathscr {A}^{\$,\bot } \Rightarrow 1] \\= & {} (\Pr [\mathscr {A}^{{\text {OCB} }[E,\tau ].\mathcal {E},\,{\text {OCB} }[E,\tau ].\mathcal {D}} \Rightarrow 1] - \Pr [\mathscr {A}^{{\text {OCB} }[\mathbb {E},\tau ].\mathcal {E},\,{\text {OCB} }[\mathbb {E},\tau ].\mathcal {D}} \Rightarrow 1]) +\\&(\Pr [\mathscr {A}^{{\text {OCB} }[\mathbb {E},\tau ].\mathcal {E},\,{\text {OCB} }[\mathbb {E},\tau ].\mathcal {D}} \Rightarrow 1] - \Pr [\mathscr {A}^{\$,\bot } \Rightarrow 1]) \\\le & {} {\mathbf \mathbf{Adv}}^{\smash {\pm \mathrm {prp}}}_E(\mathscr {A}_{\mathrm {prp}}) + 6(\sigma +q)^2/2^n + 2q_{{\mathrm {d}}}/2^\tau \end{aligned}$$

by the definition of $\mathscr {A}_{\mathrm {prp}}$ and Lemma 7. The result follows. $\square $

8 Conclusions

Limitations. There are three limitations on OCB that we would like to emphasize. The first is that OCB’s security crucially depends on the encrypting party not repeating a nonce. The mode should never be used in situations where that can’t be assured; one should instead employ a misuse-resistant AE scheme [41]. These include AES-GCM-SIV [13, 14], COLM [2], and Deoxys-II [23]. A second limitation of OCB is its birthday-bound degradation in provable security. This limitation implies that, given OCB’s 128-bit block size, one must avoid operating on anything near $2^{64}$ blocks of data. The RFC on OCB [27] asserts that a given key should be used to encrypt at most $2^{48}$ blocks (4 petabytes), including the associated data. Practical AE modes that avoid the birthday-bound degradation in security are now known [14, 19, 20, 23, 40]. Finally, OCB uses both the forward and the backward direction of its underlying blockcipher. Minematsu has devised an elegant approach for avoiding use of a blockcipher’s inverse in an OCB-like construction [33].

The OCB2 bug. We mentioned in the Introduction the devastating bug on OCB2 discovered by Inoue, Iwata, Minematsu, and Poettering [17]. What was the root cause of this bug [17], and why does OCB3 escape it? We offer two high-level explanations.

One is to claim that OCB2—and OCB1 and OCB3 as well—are over-optimized. In particular, the constructions twice omit post-whitening xors [38], switching from the XEX construction to the XE construction [30, 38] for enciphering the Checksum and for Vernam-style encrypting the final block of plaintext. (OCB2 and OCB3 also use XE for processing the AD.) Shaving off these two xors increased conceptual complexity and proof complexity for an insignificant improvement in speed.

But the intermixing of XE and XEX constructions is not by itself a problem; what made this turn out so badly for OCB2 was the muddy (and therefore error-prone) abstraction boundary for dealing with TBCs that combine XE and XEX [38, Section 8]. Rather than partitioning the tweak space into unidirectional and bidirectional tweaks, as we did with OCB3, a one-bit tag was appended to each tweak—but then the tag was not treated as an intrinsic part of the tweak. This failed to properly capture the intent that things like $\varvec{\pi }_4^N$ (with a bold $\varvec{\pi }$) and $\bar{\pi }_4^N$ (with a bar over the $\pi $) and $\pi _4^N$ in [38, Fig. 1] all need to be instantiated so as to approximate independent permutations. Properly formalizing a partitioned-tweakspace PRP helped steer us clear of the error to which OCB2 succumbed.

ISO standardization. The OCB2 bug was made more significant because it had been standardized in ISO/IEC 19772 [18]. Unfortunately, nobody on that standardization committee had told us that they were planning to standardize this scheme. Had they, we would have explained that we had already abandoned OCB2 in favor of a cleaner and more empirically informed redesign, OCB3, an unpublished draft of which already existed. The same ISO standard included another faulty AE scheme: a badly misspecified version of encrypt-then-MAC [34]. These two problems suggest, to us, that a rather insular standardization process was then in place.

Patents. At present, OCB is not widely used. This is largely due to there having been potentially relevant patents held by multiple parties. This was the reason that OCB1 fell out of being a mandatory mechanism of 802.11 [16], and the reason it was not subsequently selected for a NIST Special Publication. The limited adoption of OCB exemplifies the often poisonous impact of patents on cryptographic practice.

The second author, who held the patents most directly reading against OCB, has renounced all OCB-relevant IP; all OCB-relevant patents of his have been placed in the public domain. While Rogaway had previously licensed all open-source software and all software for non-military use, IP concerns did not go away.

Final remark. As explained in Sect. 5, the initial reason for developing OCB1 was to improve the efficiency of authenticated encryption. But it soon became clear to us that AE schemes had a second benefit: that they could improve usability. By moving closer to what an ideal encryption scheme would deliver (and, also, by adjusting an encryption-scheme’s syntax to include AD and avoid per-message randomness) one could make symmetric encryption less prone to misuse and less reliant on fussy composition techniques [4, 34]. OCB, and AE more broadly, helped birth a reconceptualization of symmetric encryption. One abandons weak security notions and schemes that are hard to correctly use and moves toward strong security notions and schemes that are easier to use well. This, not speed, may be the lasting legacy of AE.

Acknowledgements

The authors thank all who were involved in the CAESAR effort, starting with its organizer, Dan Bernstein, and fellow submitters. We thank Tetsu Iwata for his work in organizing this JoC special issue, and the referees who returned helpful comments on our manuscript. Thanks to the NSF for many years of support. NSF 0904380 was used for the development of OCB3, NSF 1314885 funded renewed work for the CAESAR submission, and NSF 1717542 funded the preparation of this manuscript.

Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Nächster Artikel A Cryptographic Analysis of the TLS 1.3 Handshake Protocol

M. Abadi and P. Rogaway. Reconciling two views of cryptography (the computational soundness of formal encryption). J. Cryptology 15(2), pp. 103–127, 2002MathSciNetCrossRef

E. Andreeva, A. Bogdanov, N. Datta, A. Luykx, B. Mennink, M. Nandi, E. Tischhauser, K. Yasuda. COLM v1. September 15, 2016. CAESAR submission. https://competitions.cr.yp.to/

M. Bellare, A. Desai, E. Jokipii, and P. Rogaway. A concrete security treatment of symmetric encryption: analysis of the DES modes of operation. Proceedings of 38th Annual Symposium on Foundations of Computer Science (FOCS 97), IEEE, 1997

M. Bellare, C. Namprempre. Authenticated encryption: relations among notions and analysis of the generic composition paradigm. J. Cryptol., 21(4), pp. 469–491, 2008MathSciNetCrossRef

D. Bernstein. The Poly1305-AES message-authentication code, in FSE 2005, LNCS vol. 3557 (Springer, New York, 2005), pp. 32–49

D. Bernstein, D. Bernstein. ChaCha, a variant of Salsa20. https://cr.yp.to/chacha/chacha-20080128.pdf. Manuscript, 2008

D. Bernstein. CAESAR competition webpage at https://competitions.cr.yp.to/caesar-submissions.html Last visited 2020.06.19

Cryptographic Engineering Research Group (CERG). Hardware Benchmarking of CAESAR Candidates. Webpage https://cryptography.gmu.edu/athena/index.php?id=CAESAR. Last visited 2019.12.19

M. Dworkin. Recommendation for block cipher modes of operation: the CCM mode for authentication and confidentiality. NIST Special Publication 800-38C. May 2004

10.

M. Dworkin. Recommendation for block cipher modes of operation: Galois/Counter Mode (GCM) and GMAC. NIST Special Publication 800-38D. November 2007

11.

V. Gligor, P. Donescu. Fast encryption and authentication: XCBC encryption and XECB authentication modes, in FSE 2001, LNCS vol. 2355, (Springer, New York, 2001), pp. 92–108

12.

S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, vol. 28, April 1984, pp. 270–299MathSciNetCrossRef

13.

S. Gueron, A. Langley, Y. Lindell: AES-GCM-SIV: Nonce misuse-resistant authenticated encryption. IETF RFC 8452. April 2019

14.

S. Gueron, Y. Lindell: GCM-SIV: Full nonce misuse-resistant authenticated encryption at under one cycle per byte, in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security 2015

15.

E. Homsirikamol, K. Gaj. Toward a new HLS-based methodology for FPGA benchmarking of candidates in cryptographic competitions: the CAESAR contest case study. International Conference on Field Programmable Technology (ICFPT 2017). IEEE Press, pp. 120–127, 2017

16.

IEEE Standard 802.11i-2004. Part 11: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Medium Access Control (MAC) Security Enhancements. 2004

17.

A. Inoue, T. Iwata, K. Minematsu, B. Poettering. Cryptanalysis of OCB2: attacks on authenticity and confidentiality, in Advances in Cryptology—CRYPTO 2019 (Springer, New York, 2019)

18.

ISO/IEC 19772. Information technology-Security techniques-Authenticated encryption. First edition, 2009-02-15. OCB scheme within standard susbsequently deprecated

19.

T. Iwata. Authenticated encryption mode for beyond the birthday bound security. AFRICACRYPT 2008, LNCS vol. 5023, Springer, pp. 125–142, 2008

20.

T. Iwata. New blockcipher modes of operation with beyond the birthday bound security. FSE 2006, LNCS 4047, pp. 310–327, 2006

21.

T. Iwata, K. Yasuda. BTM: a single-key, inverse-cipher-free mode for deterministic authenticated encryption. SAC 2009, LNCS vol. 5667, Springer, pp. 313–330, 2009

22.

T. Iwata, K. Yasuda. HBS: a single-key mode of operation for deterministic authenticated encryption. FSE 2009, LNCS vol. 5665, Springer, 2009, pp. 394–415

23.

J. Jean, I. Nikolić, T. Peyrin, Y. Seurin: Deoxys v1.41. October 12, 2016. CAESAR submission. https://competitions.cr.yp.to/

24.

C. Jutla, Encryption modes with almost free message integrity. EUROCRYPT 2001, LNCS vol. 2045, Springer, pp. 529–544, 2001

25.

T. Krovetz. OCB for block ciphers without 128-bit blocks. Internet Draft draft-krovetz-ocb-wideblock-00. May 11, 2018

26.

T. Krovetz, P. Rogaway. The software performance of authenticated-encryption modes. FSE 2011, LNCS vol. 6733, Springer, pp. 306–327, 2011

27.

T. Krovetz, P. Rogaway. The OCB authenticated-encryption algorithm. IETF RFC 7253. May 29, 2014

28.

J. Katz, M. Yung. Unforgeable encryption and adaptively secure modes of operation. FSE 2000, LNCS vol. 1978, Springer, 2001

29.

T. Krovetz, P. Rogaway. The OCB authenticated-encryption algorithm. Internet Draft draft-krovetz-ocb-00.txt. March 2005

30.

M. Liskov, R. Rivest, D. Wagner. Tweakable block ciphers. CRYPTO 2002, LNCS vol. 2442, Springer, pp. 31–46, 2002

31.

D. McGrew, J. Viega. The security and performance of the Galois/Counter Mode (GCM) of operation. INDOCRYPT 2004, LNCS vol. 3348, Springer, pp. 343–355, 2004. Also Cryptology ePrint report 2004/193, with somewhat different performance results

32.

Y. Nir, A. Langley. ChaCha20 and Poly1305 for IETF Protocols. IETF RFC 8439. May 2015

33.

K. Minematsu: Parallelizable rate-1 authenticated encryption from pseudorandom functions. EUROCRYPT 2014, LNCS vol. 8441, Springer, pp. 275–292, 2014

34.

C. Namprempre, P. Rogaway, T. Shrimpton. Reconsidering generic composition. EUROCRYPT 2014. LNCS vol. 8441, Springer, pp. 257–274, 2014

35.

E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.3. IETF RFC 8446. August 2018

36.

R. Reyhanitabar, S. Vaudenay, and D. Vizár. Authenticated encryption with variable stretch. ASIACRYPT (1), pp. 396–425, 2016MathSciNetMATH

37.

P. Rogaway. Authenticated-encryption with associated-data. CCS 2002, ACM Press, 2002

38.

P. Rogaway. Efficient instantiations of tweakable blockciphers and refinements to modes OCB and PMAC. ASIACRYPT 2004, LNCS vol. 3329, Springer, pp. 16–31, 2004

39.

P. Rogaway, M. Bellare, and J. Black. OCB: A block-cipher mode of operation for efficient authenticated encryption. ACM Transaction on Information and System Security, 6(3), pp. 365–403, 2003. Earlier version, with T. Krovetz, in CCS 2001

40.

T. Peyrin, Y. Seurin: Counter-in-Tweak: Authenticated encryption modes for tweakable block ciphers. CRYPTO 2016, LNCS vol. 9814, Springer, pp. 33–63, 2016

41.

P. Rogaway, T. Shrimpton. A provable-security treatment of the key-wrap problem. EUROCRYPT 2006, LNCS vol. 4004, Springer, pp. 373–390, 2006

42.

SUPERCOP website. Toolkit and website from VAMPIRE Lab. https://bench.cr.yp.to/supercop.html. Last visited 2019.12.08

43.

D. Whiting, R. Housley, N. Ferguson. AES encryption & authentication using CTR mode & CBC-MAC. IEEE P802.11 doc 02/001r2, May 2002

44.

H. Wu, B. Preneel. AEGIS: a fast authenticated encryption algorithm. SAC 2013, LNCS vol. 8282, pp. 185–201. Also CAESAR submission document https://competitions.cr.yp.to/round3/aegisv11.pdf. 2016

Titel: The Design and Evolution of OCB
verfasst von: Ted Krovetz
Phillip Rogaway
Publikationsdatum: 01.10.2021
Verlag: Springer US
Erschienen in: Journal of Cryptology / Ausgabe 4/2021
Print ISSN: 0933-2790
Elektronische ISSN: 1432-1378
DOI: https://doi.org/10.1007/s00145-021-09399-8

Springer Professional

Abstract

Publisher's Note

1 Introduction

2 Preliminaries

3 Specification 1

4 Specification 2

4.1 The TBC-Based Scheme

4.2 Instantiating the TBC

5 Evolution

6 Performance

7 Security

7.1 Security of the TBC-Based Scheme

7.2 Stretch-then-Shift Hash

7.3 Instantiating the TBC

7.4 Wrapping Up

8 Conclusions

Acknowledgements

Publisher's Note

Weitere Artikel der Ausgabe 4/2021

Unconditionally Secure Computation Against Low-Complexity Leakage

Internal Symmetries and Linear Properties: Full-permutation Distinguishers and Improved Collisions on Gimli

Fast Secure Two-Party ECDSA Signing

Correction to: Editorial

Injective Trapdoor Functions via Derandomization: How Strong is Rudich’s Black-Box Barrier?

Compact Designated Verifier NIZKs from the CDH Assumption Without Pairings