1 Introduction

The SHA-2 family of hash functions is standardized by NIST as part of the Secure Hash Standard in FIPS 180-4 [21]. This standard is not superseded by the upcoming SHA-3 standard. Rather, the SHA-3 hash functions supplement the SHA-2 family. Thus, it is likely that the SHA-2 family will remain as ubiquitously deployed in the foreseeable future as it is now. Therefore, the continuous application of state-of-the-art cryptanalytic techniques for quantifying the security margin of hash functions of the SHA-2 family is of significant practical importance.

In this work, we focus on the two most recent members of the SHA-2 family, SHA-512/224 and SHA-512/256. As already observed by Gueron et al. [10], using truncated SHA-512 variants like SHA-512/256 gives a significant performance advantage over SHA-256 on 64-bit platforms due to the doubled input block size. At the same time, the shorter 256-bit hash values are more economical, compatible with existing applications, and offer the same security level as SHA-256. In addition, the resulting chop-MD [5] construction of SHA-512/224 and SHA-512/256, with its wide-pipe structure, provides cryptographic benefits over the standard Merkle-Damgård [7, 20] structure by preventing generic attacks like Joux’ multicollision attack [12], Kelsey and Kohno’s herding and Nostradamus attacks [13], and Kelsey and Schneier’s second preimages for long messages [14].

However, no cryptanalysis dedicated to SHA-512/224 and SHA-512/256 has been published so far. Therefore, we examine the effects of truncating the hash value of SHA-512. We show that due to this truncation, practical free-start collisions for 43-step SHA-512/256 and 44-step SHA-512/224 are possible. Moreover, we improve upon the previous best collisions for 24-step SHA-512 [11, 23] and show collisions for 27 steps of SHA-512, SHA-512/224, and SHA-512/256. Since all of our results are practical, we provide examples of colliding message pairs for every attack. Our results are summarized in Table 1 together with previously published collision attacks.

Table 1. Best published collision attacks on the SHA-512 family.

Related Work. No dedicated cryptanalysis of SHA-512/224 or SHA-512/256 has been published so far. However, there are a number of results targeting SHA-512. The security of SHA-512 against preimage attacks was first studied by Aoki et al. [1]. They presented MITM preimage attacks on 46 steps of the hash function. This was later extended to 50 steps by Khovratovich et al. [15]. However, due to the wide-pipe structure of SHA-512/224 and SHA-512/256, these attacks do not carry over to the truncated variants.

The currently best known practical collision attack on the SHA-512 hash function is for 24 steps. It was published independently by Indesteege et al. [11] and by Sanadhya and Sarkar [23]. Both attacks are trivial extensions of the attack strategy of Nikolić and Biryukov [22] which applies to both SHA-256 and SHA-512. Recently, Eichlseder et al. [9] demonstrated how to extend these attacks to get semi-free-start collisions for SHA-512 reduced to 38 steps with practical complexity. Furthermore, second-order differential collisions for SHA-512 up to 48 steps with practical complexity have been shown by Yu et al. [27]. We want to note that all these practical collision attacks on SHA-512 are also applicable to its truncated variants.

Additionally, Li et al. showed in [17] that particular preimage attacks on SHA-512 can also be used to construct free-start collision attacks for the step-reduced hash function and its truncated variants. They show a free-start collision for 57-step SHA-512 and 40-step SHA-384. Both attacks are only slightly faster than the respective generic attacks.

Outline. The remainder of the paper is organized as follows. We describe the design of the SHA-2 family in Sect. 2. Then, we briefly explain our attack strategy and discuss the choice of suitable starting points for our attacks in Sect. 3. The actual attacks on step-reduced SHA-512/224 and SHA-512/256 are presented in Sect. 4.

2 Description of SHA-512 and Other SHA-2 Variants

The SHA-2 family of hash functions is specified by NIST as part of the Secure Hash Standard (SHS) [21]. The standard defines two main algorithms, SHA-256 and SHA-512, with truncated variants SHA-224 (based on SHA-256) and SHA-512/224, SHA-512/256, and SHA-384 (based on SHA-512). In addition, NIST defines a general truncation procedure for arbitrary output lengths up to 512 bits. Below, we first describe SHA-512, followed by its truncated variants SHA-512/224 and SHA-512/256 that this paper is focused on. Finally, the main differences to SHA-256 and SHA-224 are briefly discussed.

SHA-512. SHA-512 is an iterated hash function that pads and processes the input message using t 1024-bit message blocks \(m_j\). The 512-bit hash value is computed using the compression function f:

$$\begin{aligned} h_0&= \mathrm {IV}, \\ h_{j+1}&= f(h_{j},m_{j}) \qquad \text {for } 0 \le j < t. \\ \end{aligned}$$

The hash output is the final 512-bit chaining value \(h_t\).

In the following, we briefly describe the compression function f of SHA-512. It basically consists of two parts: the message expansion and the state update transformation. A more detailed description of SHA-512 is given by NIST [21].

We use \(+\) (or \(-\)) to denote addition (or subtraction) modulo \(2^{64}\); \(\oplus \) (or \(\wedge \)) is bitwise exclusive-or (or bitwise and) of 64-bit words, and \(\ggg n\) (or \(\gg n\)) denotes rotate-right (or shift-right) by n bits.

Padding and Message Expansion. The message expansion of SHA-512 splits each 1024-bit message block into 16 64-bit words \(M_i\), \(i = 0, \dots , 15\), and expands these into 80 expanded message words \(W_i\) as follows:

$$\begin{aligned} W_i = {\left\{ \begin{array}{ll} M_i &{} 0\le i < 16, \\ \sigma _1(W_{i-2}) + W_{i-7} + \sigma _0(W_{i-15}) + W_{i-16} \quad &{} 16 \le i < 80. \end{array}\right. } \end{aligned}$$
(1)

The functions \(\sigma _0(x)\) and \(\sigma _1(x)\) are given by

$$\begin{aligned} \sigma _0(x)&= (x \ggg 1) \oplus (x \ggg 8) \oplus (x \gg 7), \\ \sigma _1(x)&= (x \ggg 19) \oplus (x \ggg 61) \oplus (x \gg 6). \end{aligned}$$

State Update Transformation. We use the alternative description of the SHA-512 state update by Mendel et al. [18], which is illustrated in Fig. 1.

Fig. 1. The state update transformation of SHA-512.

The state update transformation starts from the previous 512-bit chaining value \(h_{j} = (A_{-1},\dots ,A_{-4},E_{-1},\dots ,E_{-4})\) and updates it by applying the step function 80 times. In each step \(i=0,\dots ,79\), one 64-bit expanded message word \(W_i\) is used to compute the two state variables \(E_{i}\) and \(A_{i}\) as follows:

$$\begin{aligned} E_{i}&= A_{i-4} + E_{i-4} + \varSigma _1(E_{i-1}) + \mathrm{{IF}}(E_{i-1},E_{i-2},E_{i-3}) + K_i + W_i, \end{aligned}$$
(2)
$$\begin{aligned} A_{i}&= E_{i} - A_{i-4} + \varSigma _0(A_{i-1}) + \mathrm{{MAJ}}(A_{i-1},A_{i-2},A_{i-3}). \end{aligned}$$
(3)

For the definition of the step constants \(K_i\), we refer to the standard document [21]. The bitwise Boolean functions IF and MAJ used in each step are defined by

$$\begin{aligned} \mathrm{{IF}}(x,y,z)&= (x \wedge y) \oplus (x \wedge z) \oplus z, \\ \mathrm{{MAJ}}(x,y,z)&= (x \wedge y) \oplus (y \wedge z) \oplus (x \wedge z), \end{aligned}$$

and the linear functions \(\varSigma _0\) and \(\varSigma _1\) are defined as follows:

$$\begin{aligned} \varSigma _0(x)&= (x \ggg 28) \oplus (x \ggg 34) \oplus (x \ggg 39), \\ \varSigma _1(x)&= (x \ggg 14) \oplus (x \ggg 18) \oplus (x \ggg 41). \end{aligned}$$

After the last step of the state update transformation, the previous chaining value is added to the output of the state update (Davies-Meyer construction). The result of this feed-forward sum is the chaining value \(h_{j+1}\) for the next message block \(m_{j+1}\) (or the final hash value \(h_t\)):

$$\begin{aligned} h_{j+1} = (A_{79} + A_{-1}, \ldots , A_{76} + A_{-4}, E_{79} + E_{-1}, \ldots , E_{76} + E_{-4}). \end{aligned}$$
(4)
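The description above can be checked end to end with a short script. The following is a minimal sketch, not the authors' implementation: it evaluates Eqs. (1) to (4) in the \(A_i\)/\(E_i\) form and compares the result with Python's `hashlib`. The constants are not listed in this paper, so the sketch derives them as FIPS 180-4 defines them (fractional parts of the cube roots and square roots of the first primes); the helper names `icbrt`, `first_primes`, `compress`, and `sha512_ae` are chosen here for illustration only.

```python
import hashlib
import struct
from math import isqrt

MASK = (1 << 64) - 1

def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK

def icbrt(n):
    """Floor of the cube root of a non-negative integer (binary search)."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

PRIMES = first_primes(80)
# First 64 fractional bits of the cube roots (K_i) and square roots (IV).
K = [icbrt(p << 192) & MASK for p in PRIMES]
IV = [isqrt(p << 128) & MASK for p in PRIMES[:8]]

def sigma0(x): return rotr(x, 1) ^ rotr(x, 8) ^ (x >> 7)
def sigma1(x): return rotr(x, 19) ^ rotr(x, 61) ^ (x >> 6)
def Sigma0(x): return rotr(x, 28) ^ rotr(x, 34) ^ rotr(x, 39)
def Sigma1(x): return rotr(x, 14) ^ rotr(x, 18) ^ rotr(x, 41)
def IF(x, y, z): return (x & y) ^ (x & z) ^ z
def MAJ(x, y, z): return (x & y) ^ (y & z) ^ (x & z)

def compress(h, block, steps=80):
    """h = (A_-1, ..., A_-4, E_-1, ..., E_-4); block is 128 bytes."""
    W = list(struct.unpack(">16Q", block))
    for i in range(16, steps):  # Eq. (1)
        W.append((sigma1(W[i - 2]) + W[i - 7] + sigma0(W[i - 15]) + W[i - 16]) & MASK)
    A = [h[3], h[2], h[1], h[0]]  # chronological order: A_-4, ..., A_-1
    E = [h[7], h[6], h[5], h[4]]  # chronological order: E_-4, ..., E_-1
    for i in range(steps):
        e = (A[-4] + E[-4] + Sigma1(E[-1]) + IF(E[-1], E[-2], E[-3]) + K[i] + W[i]) & MASK  # Eq. (2)
        a = (e - A[-4] + Sigma0(A[-1]) + MAJ(A[-1], A[-2], A[-3])) & MASK  # Eq. (3)
        E.append(e)
        A.append(a)
    # Feed-forward, Eq. (4): newest state words are paired with A_-1 / E_-1 first.
    return ([(A[-1 - k] + h[k]) & MASK for k in range(4)]
            + [(E[-1 - k] + h[4 + k]) & MASK for k in range(4)])

def sha512_ae(message):
    """Full SHA-512: padding, iteration over blocks, 512-bit output."""
    padded = message + b"\x80"
    padded += b"\x00" * (-(len(padded) + 16) % 128)
    padded += (8 * len(message)).to_bytes(16, "big")
    h = IV
    for offset in range(0, len(padded), 128):
        h = compress(h, padded[offset:offset + 128])
    return struct.pack(">8Q", *h)

# Compare against the reference implementation shipped with Python.
print(sha512_ae(b"abc") == hashlib.sha512(b"abc").digest())
```

The `steps` parameter of `compress` is an addition of this sketch; it only makes it convenient to experiment with step-reduced variants such as those attacked in Sect. 4.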

SHA-512/256 and SHA-512/224. These truncated variants of SHA-512 differ only in their initial values and a final truncation to 256 or 224 bits, respectively. The rest of the algorithmic description remains exactly the same. The message digest of SHA-512/256 is obtained by omitting the output words \(E_{79}+E_{-1}\), \(E_{78}+E_{-2}\), \(E_{77}+E_{-3}\), and \(E_{76}+E_{-4}\) of the last compression function call. SHA-512/224 additionally omits the 32 least significant bits of \(A_{76}+A_{-4}\).
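To make the effect of the truncation concrete, the following sketch shows which output differences survive it. It assumes the eight output words are ordered as in Eq. (4) and serialized big-endian; the function name `truncate_output` and the sample values are illustrative only.

```python
import struct

def truncate_output(words, bits):
    """Keep the leftmost `bits` bits of the 8-word compression output of Eq. (4)."""
    full = struct.pack(">8Q", *words)
    return full[: bits // 8]

base = list(range(1, 9))  # arbitrary stand-in for a compression function output

# A difference in an E output word is invisible for both truncated variants.
e_diff = list(base)
e_diff[7] ^= 1

# A difference in the lower 32 bits of the fourth word (A_76 + A_-4)
# is invisible for SHA-512/224 only.
low_diff = list(base)
low_diff[3] ^= 1 << 5

print(truncate_output(base, 256) == truncate_output(e_diff, 256))
print(truncate_output(base, 224) == truncate_output(low_diff, 224))
print(truncate_output(base, 256) == truncate_output(low_diff, 256))
```

Only the truncation is modeled here; the variant-specific initial values are not.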

SHA-256 and SHA-224. SHA-256 and SHA-512 are closely related. Thus, we only point out properties of SHA-256 which differ from SHA-512:

  • The wordsize is 32 instead of 64 bits.

  • IV and \(K_i\) are the 32 most significant bits of the respective SHA-512 value.

  • The step function is applied 64 instead of 80 times.

  • The linear functions \(\sigma _0, \sigma _1, \varSigma _0\) and \(\varSigma _1\) use different rotation values.

SHA-224 is a truncated variant of SHA-256 with different IV, in which the output word \(E_{60} + E_{-4}\) is omitted.

3 Attack Strategy

Starting from the ground-breaking results of Wang et al. [25, 26], the search techniques used for practical collisions have been significantly improved, hitting their current peak in the attacks on SHA-256 [2, 19] and SHA-512 [9, 27]. In spite of all achieved improvements, the top-level attack strategy has remained essentially the same. At first, a suitable starting point for the search must be determined to define the search space and hopefully make the ensuing search process feasible. The search itself usually involves two phases: The search for a suitable differential characteristic, and the message modification phase to determine a collision-producing message pair for this characteristic. The search for this characteristic and message pair can either be done by hand or, for more complex functions like SHA-2, using an automatic search tool. We use a heuristic search tool based on a guess-and-determine strategy, which we briefly describe in Sect. 3.1. Afterwards, we discuss the choice of suitable starting points in Sect. 3.2.

3.1 Guess-and-Determine Search Tool

To search for differential characteristics and colliding message pairs, we use an automatic search tool, which implements a configurable heuristic guess-and-determine search strategy. Roughly, the tool is partitioned into two separate, but closely interacting parts: The representation of the analyzed cryptographic primitive and the search procedure.

Representation. The tool internally represents differences at bit level, which allows it to store all possible stages from a completely unrestricted bit over signed differences down to exact values. Thus, the same tool can be used in the search for a characteristic and in the search for a message pair. The conditions are grouped in words representing the internal variables of the hash function. These words can then be connected with any operations (typically bitwise functions or modular additions) to define the hash function.
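One way to picture such a bit-level representation is as the set of value pairs \((x, x')\) that a bit may still take in the two messages. The sketch below is an assumption for illustration: the symbols follow the generalized-conditions notation common in the literature, and the tool's actual internal encoding is not described in this paper.

```python
# Each condition is the set of allowed (bit in m, bit in m') pairs.
CONDITIONS = {
    "?": frozenset({(0, 0), (0, 1), (1, 0), (1, 1)}),  # unrestricted
    "-": frozenset({(0, 0), (1, 1)}),                  # no difference
    "x": frozenset({(0, 1), (1, 0)}),                  # unsigned difference
    "u": frozenset({(1, 0)}),                          # signed difference
    "n": frozenset({(0, 1)}),                          # signed difference
    "0": frozenset({(0, 0)}),                          # exact value
    "1": frozenset({(1, 1)}),                          # exact value
}

def refine(current, new):
    """Intersect two conditions; return None if they contradict each other."""
    allowed = CONDITIONS[current] & CONDITIONS[new]
    for symbol, pairs in CONDITIONS.items():
        if pairs == allowed:
            return symbol
    return None  # empty intersection: contradiction

print(refine("?", "x"), refine("x", "u"), refine("-", "x"))
```

Refining a bit only ever shrinks its set of allowed pairs, which is why the same structure can serve both the characteristic search and the message search.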

Search. The search procedure uses the bitwise conditions as variables, and attempts to find a solving assignment with the help of a heuristic guess-and-determine strategy [8], similar to SAT solvers. The following steps are repeated until a solution is found:

  • Guess: Pick a bit and guess its value (e.g., no difference, or a specific assignment).

  • Determine: The previous guess influences other connected bit conditions. Determine these effects, which might result in further refinement of other bit conditions, or a contradiction.

  • Backtrack: If a contradiction is detected, resolve this conflict by undoing previous guesses and replacing them with other choices.

This simple approach alone is not sufficient to go through the whole search space, so numerous refinements have been proposed to fine-tune this method. These include the detection of two-bit conditions [18], backtracking strategies, and a look-ahead approach to guide the search [9]. Additionally, SHA-2-specific heuristics and strategies [18, 19] have been proposed, deciding which parts of the state to guess with higher priority.
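The guess, determine, and backtrack loop can be sketched on a toy problem. The code below is not the search tool used in this work; it is a minimal SAT-style illustration in which conditions are clauses over Boolean variables (positive or negative integers, a hypothetical encoding chosen for brevity) and the "determine" step is plain unit propagation.

```python
def propagate(clauses, assignment):
    """Determine: derive forced values; return False on a contradiction."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == want:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return False  # every literal is false: contradiction
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0  # forced value
                changed = True
    return True

def solve(clauses, variables, assignment=None):
    assignment = dict(assignment or {})
    if not propagate(clauses, assignment):
        return None  # Backtrack: the caller tries its other choice
    free = [v for v in variables if v not in assignment]
    if not free:
        return assignment
    for guess in (False, True):  # Guess: pick a variable and try a value
        trial = dict(assignment)
        trial[free[0]] = guess
        result = solve(clauses, variables, trial)
        if result is not None:
            return result
    return None

# (x1 xor x2) and (x2 implies x3) and not (x1 and x3)
clauses = [[1, 2], [-1, -2], [-2, 3], [-1, -3]]
print(solve(clauses, [1, 2, 3]))
```

The refinements mentioned above (two-bit conditions, look-ahead, guessing priorities) would change how the variable and value are chosen and how much is derived per guess; the overall loop stays the same.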

3.2 Finding Starting Points for SHA-2

To model SHA-2 as a satisfiability problem for the search tool, we need to introduce suitable intermediate variables. Based on the alternative description from Sect. 2, we only use the words \(A_i\) and \(E_i\) of the state, plus the words \(W_i\) of the message expansion. Figure 2 illustrates the update rules for A, E and W by highlighting the input words for updating each word: Each row represents one of the 80 step iterations, with its three state words \(A_i\), \(E_i\), and \(W_i\).

Fig. 2. Update rules to compute \(A_i, E_i\), and \(W_i\) from other state words.

Local Collisions. All our results are based on “local collisions” in the message expansion: the (expanded) message words in the middle steps are carefully selected so that the differences cancel out in as many consecutive steps as possible in the forward and backward expansion, i.e., the first and last few expanded message words contain no differences. The t middle steps with differences can induce differences in the \(A_i\) and \(E_i\) words. However, the \(W_i\) words can be used to achieve zero difference in the last 4 of the t words \(E_i\), and in the last 8 of the t words \(A_i\). This is necessary to obtain words with zero difference in the very last 4 steps of the state update and thus in the output chaining value.

As an example, the starting point for the 27-step collisions for SHA-256 [18] allows differences in expanded message words \(W_7, W_8, W_{12}, W_{15}\), and \(W_{17}\), as well as state words \(E_7, \ldots , E_{13}\) and \(A_7, \ldots , A_{9}\). The exact bitwise signed differences are chosen during the search such that any potential differences in \(W_{19}, W_{22}, W_{23}, W_{24}\), as well as \(E_{14}, \ldots , E_{17}\) and \(A_{10}, \ldots , A_{13}\) cancel out. The resulting starting point is illustrated in Fig. 3a. We show in Sect. 4.3 how the same starting point can be used for SHA-512.

The semi-free-start collision starting point covering the most steps so far is for 38 steps of SHA-256 [19] and SHA-512 [9], with a local collision spanning \(t=18\) steps. Considering the large number of steps, the number of expanded message words with differences and cancellations is remarkably low: only 6 words with differences, and 6 words imposing cancellation conditions.

To find candidates for a higher number of steps, we enumerated all possible selections of active message words (more precisely, of some \(t \le 20\) intermediate expanded message words, the “core words” of the local collision) and investigated the forward and backward expansion under certain assumptions: the t core words are chosen freely, according to the message expansion rule; in the forward and backward expansion, if at least 2 of the input words have differences, they are assumed to cancel out, while a single input word with difference never cancels out. Criteria for selecting suitable candidates then include a low number t of spanned steps and a low number of required cancellation constraints. The best (consistent) result for 39 steps, spanning \(t=19\) steps with 9 cancellations, is given in Fig. 3b.
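The forward half of this propagation rule can be sketched in a few lines. The sketch is a simplification under stated assumptions: it starts from a set of active words among the 16 message words \(W_0, \ldots , W_{15}\) and applies the rule to every later expanded word, whereas the enumeration described above also covers the backward expansion and freely chosen core words. The function name `forward_pattern` is illustrative.

```python
def forward_pattern(active_message_words, steps):
    """Propagate 'has a difference' flags through the message expansion.

    Rule: exactly one active input makes W_i active (it never cancels);
    two or more active inputs are assumed to cancel (a cancellation constraint).
    """
    active = set(active_message_words)
    cancellations = []
    for i in range(16, steps):
        inputs = (i - 2, i - 7, i - 15, i - 16)  # inputs of W_i in Eq. (1)
        hits = sum(1 for j in inputs if j in active)
        if hits == 1:
            active.add(i)
        elif hits >= 2:
            cancellations.append(i)
    return sorted(active), cancellations

# Message-word differences of the 27-step starting point (Fig. 3a).
active, cancellations = forward_pattern({7, 8, 12, 15}, steps=27)
print(active)         # words with differences
print(cancellations)  # words that impose cancellation conditions
```

For this input, the rule reproduces the pattern listed above for the 27-step starting point: \(W_{17}\) becomes active, and cancellations are required in \(W_{19}, W_{22}, W_{23}\), and \(W_{24}\). Extending to 28 steps would activate \(W_{27}\), so this pattern cannot simply be continued.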

Fig. 3. SHA-2 starting points: words with differences and words with cancellations.

Semi-Free-Start Collisions and Collisions. The discussed starting points are targeted to find semi-free-start collisions, that is, different messages \(m, m'\) and an IV \(h_0\) such that \(f(h_0, m) = f(h_0, m')\). However, they can also be used for hash function collisions with the original IV \(h_0\) by trading the freedom of the IV for freedom in the message words.

In order to find hash function collisions, the first few message words \(W_i\) must retain sufficient freedom (i.e., they should not be constrained by conditions from the message expansion for cancelling differences) to allow matching the correct IV value. Ideally, this means that the first 8 message words \(W_0, \ldots , W_7\) are free of any conditions (no differences, but also not constrained by conditions from other message words connected via the message expansion). If the \(W_i\) differences are sparse enough overall, it can also be sufficient to have at least 5 words \(W_0, \ldots , W_4\) free of conditions by providing the remaining freedom with a two-block approach [19].

The starting points of Fig. 3a and 3b both have at least 7 message words free of differences in the beginning. However, the local collision shown in Fig. 3b spans \(t=19\) steps. Thus, the first message words are constrained by many conditions, leaving too little freedom to match the correct IV. In contrast, the 11-step local collision shown in Fig. 3a provides enough freedom in the first 7 message words to be used in a single-block collision attack [18].

4 Collision Attacks for Truncated SHA-512 Variants

The hash functions SHA-512/224 and SHA-512/256 differ from SHA-512 in their IV and a final processing step, which truncates the 512-bit state to 224 or 256 bits, respectively. Consequently, the semi-free-start collisions demonstrated for SHA-512 [9] are also valid for these truncated versions (since the IV is non-standard anyway in this attack scenario). In this section, we first improve these results by providing 39-step semi-free-start collisions for SHA-512 and its variants. We then extend this result to free-start collisions for 43-step SHA-512/256 and 44-step SHA-512/224. By free-start collisions, we mean two messages \(m, m'\) and two IVs \(h_0, h_0'\) such that the hash values of m (under IV \(h_0\)) and \(m'\) (under IV \(h_0'\)) collide. Note that free-start collisions are not equivalent to collisions of the compression function for truncated SHA-2 versions, since the truncated output bits of the last compression function call may contain differences. Additionally, we present collisions for 27 steps of SHA-512, SHA-512/224, and SHA-512/256.

4.1 Semi-free-start Collisions

We use the 39-step starting point from Fig. 3b. Previous work showed that sparse differences particularly in the \(A_i\) words are essential for the success probability of the message modification phase. For this reason, we additionally require that in 6 words between \(A_{8}\) and \(A_{18}\), namely \(A_{11}, A_{12}, A_{13}, A_{14}, A_{15}\), and \(A_{17}\), differences also cancel out. The five consecutive zero-difference words in \(A_i\) also force \(E_{15}\) to zero difference. These additional requirements are already marked in Fig. 3b (hatched area).

The first task for the search procedure with the solving tool is to fix a suitable signed characteristic. Compared to the previously published 38-step SHA-512 semi-free-start collision [9], the local collision for our starting point spans 19 steps (compared to previously 18) and has 9 (previously 6) active expanded message words. Cancellations are also required in 9 (previously 6) expanded message words. This increases the necessity for very sparse differences in \(A_i\) and \(W_i\) in steps 16–26. For this reason, we require a single-bit difference in \(W_{26}, W_{17}\) and \(A_{18}\), and very low Hamming weights for the other words. We finally found a characteristic with at most two active bits in almost all words of \(A_i\) and \(W_i\) (except \(A_{9}, A_{10}, W_{11}, W_{12}\)).

After the characteristic is fixed, we need to find a complying message pair. We start by guessing the dense parts in \(A_i\) and \(E_i\), hoping that the sparser conditions in the later steps are fulfilled probabilistically. Since the dense parts are already almost fully determined by the characteristic and the sparse parts impose only a few conditions, a message pair is easily found. The result is a semi-free-start collision valid for all SHA-512 variants. We give an example in Appendix A in Table 4a.

4.2 Free-Start Collisions

Free-start collisions are a generalization of semi-free-start collisions, so the 39-step results obtained in the previous section give a first result for SHA-512/224 and SHA-512/256. However, we can take advantage of the truncated output bits to add several more steps. If we add another step in the beginning or in the end, the existing difference pattern remains unchanged, but there will be differences in the word \(W_0\) (computable via backward expansion, which includes \(W_{i+9} = W_{9}\), the previous \(W_{8}\) from Fig. 3b) or in the new word \(W_{39}\) (via the normal forward expansion, which includes \(W_{39-15} = W_{24}\)), respectively. These, in turn, can imply differences in \(E_{-4}\) or in \(A_{39}\) and \(E_{39}\), which translates to differences in the IV (turning semi-free-start into free-start results, and included in the hash value via the feed-forward) or directly in the compression function output, respectively.

The advantage of adding steps in the beginning is that it is possible to limit the additional differences in the state update words to E, and keep A free of new differences. Any differences in \(E_{-1}, \ldots , E_{-4}\) will be added to the compression function output with the final feed-forward, but the corresponding words of the result are truncated, so the hash outputs still collide.

Free-Start Collisions for 43-Step SHA-512/256. Since SHA-512/256 truncates the last 4 output words of the compression function call (\(E_{79}+E_{-1}\), \(E_{78}+E_{-2}\), \(E_{77}+E_{-3}\), and \(E_{76}+E_{-4}\)), differences in \(E_{-1}, \ldots , E_{-4}\) are acceptable for a free-start collision. This observation allows us to add 4 additional steps in the beginning of the 39-step starting point from Fig. 3b. Shifting the characteristic “downwards” by 4 steps causes the previous message words \(W_{12}, \ldots , W_{15}\) to turn into new expanded message words \(W_{16}, \ldots , W_{19}\); in particular, this affects the difference in the previous word \(W_{12}\). To determine a compatible difference pattern for the new first 4 words, the message expansion can be computed backwards from the new words \(W_4, \ldots , W_{19}\) via

$$\begin{aligned} W_i = W_{i+16} - \sigma _1(W_{i+14}) - W_{i+9} - \sigma _0(W_{i+1}). \end{aligned}$$

It turns out that all 4 new words will contain differences (\(W_3\) from \(W_{3+9} = W_{12}\); \(W_2\) from \(W_{2+1} = W_{3}\) and \(W_{2+14} = W_{16}\); \(W_1\) from \(W_{1+1} = W_2\) and \(W_{1+14} = W_{15}\); and \(W_0\) from \(W_{0+1} = W_1\), \(W_{0+14} = W_{14}\) and \(W_{0+16}=W_{16}\)). However, similar to steps 27–30, the state words \(A_i\) and \(E_i\) can be kept free of differences for 4 steps. To achieve this, the search tool needs to find differences in the IV words \(E_{-4}, \ldots , E_{-1}\) to cancel out those in \(W_0, \ldots , W_3\) when computing \(E_0, \ldots , E_3\). The resulting starting point is given in Fig. 4a.
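The backward expansion is easy to check numerically. The sketch below expands 16 random message words forward to \(W_{19}\) and then recovers \(W_3, \ldots , W_0\) from \(W_4, \ldots , W_{19}\) with the backward rule; the helper names are illustrative.

```python
import random

MASK = (1 << 64) - 1

def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK

def sigma0(x): return rotr(x, 1) ^ rotr(x, 8) ^ (x >> 7)
def sigma1(x): return rotr(x, 19) ^ rotr(x, 61) ^ (x >> 6)

def expand_forward(message_words, steps):
    W = list(message_words)
    for i in range(16, steps):
        W.append((sigma1(W[i - 2]) + W[i - 7] + sigma0(W[i - 15]) + W[i - 16]) & MASK)
    return W

def prepend_words(words, count):
    """Given at least 16 consecutive expanded words, compute `count` earlier ones."""
    words = list(words)
    for _ in range(count):
        # With words[0] = W_{i+1}: words[15] = W_{i+16}, words[13] = W_{i+14},
        # words[8] = W_{i+9}.
        w = (words[15] - sigma1(words[13]) - words[8] - sigma0(words[0])) & MASK
        words.insert(0, w)
    return words

rng = random.Random(0)
W = expand_forward([rng.getrandbits(64) for _ in range(16)], steps=20)
print(prepend_words(W[4:20], 4) == W)
```

Because the expansion is a bijection on any 16 consecutive words, a difference pattern fixed on \(W_4, \ldots , W_{19}\) fully determines the differences in the four prepended words.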

Fig. 4. Potential free-start starting points (differences and cancellations).

For the search procedure with the solving tool, we fixed the signed differences of steps 12–30 to the same values as the 39-step SHA-512 semi-free-start collision of Sect. 4.1. Then, to complete the characteristic, we first search for a valid solution for the dense part of the middle steps (\(A_i\) and \(E_i\) in steps 13–16, and \(E_i\) in steps 17–27), and finally fix the corresponding message words \(W_i\) in steps 13–17, which determines the complete state, including the dense differences in the prepended steps and IV.

The search only takes seconds on a standard computer; an example for a free-start collision is given in Appendix A in Table 3a.

Free-Start Collisions for 44-Step SHA-512/224. A very similar strategy can be employed to extend the previous 43-step free-start collision by another step for SHA-512/224. Prepending an additional step shifts the difference of the previous word \(E_{-1}\) to \(E_0\), which in turn requires a cancellation in \(A_0\) and a difference in \(A_{-4}\), as illustrated in Fig. 4b. However, only the least significant 32 bits of the corresponding compression function output word are truncated. Furthermore, this output word is computed from \(A_{-4}\) via modular addition, so even differences only in the lower 32 bits can possibly cause differences in the untruncated output bits.

Fortunately, the underlying characteristic of signed differences as used for the 39-step SHA-512 semi-free-start collision is well compatible with our constraints: The difference in \(A_{-4}\) needs to cancel that in \(W_4\) in a modular addition (via \(E_0\), by Eqs. (3) and (2) or Fig. 2, since all other involved words have zero difference). This difference of \(W_4\), in turn, is dictated by that in \(W_{13}\) (by the update rule for \(W_{20}\), where again all other involved words have zero difference). None of these equalities involves any of the bitwise functions \(\sigma , \varSigma \), MAJ or IF. Thus, the modular difference in \(A_{-4}\) must be the same as that in \(W_{13}\), which is already fixed by the underlying characteristic to a modular difference of \(+32\). Written as bitwise differences, this will translate to a single-bit difference (in the sixth least significant bit) with probability \(\frac{1}{2}\) (which does not carry over to the untruncated bits of the final output with overwhelming probability). Indeed, the example for a free-start collision given in Appendix A in Table 2a only displays this single-bit difference in \(A_{-4}\) (and no carries in the output bits).
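The relation between the modular difference \(+32\) and its bitwise form can be illustrated directly. The sketch below is a small check, not part of the attack: it counts over a toy range how often adding 32 flips only the sixth least significant bit, and shows one value for which the carry chain reaches the untruncated upper half of the word.

```python
MASK = (1 << 64) - 1

def xor_difference(x, delta=32):
    """Bitwise (XOR) difference between x and x + delta modulo 2^64."""
    return x ^ ((x + delta) & MASK)

# The difference is the single bit 2^5 exactly when bit 5 of x is zero,
# i.e., for half of all values.
single_bit = sum(1 for x in range(1 << 12) if xor_difference(x) == 32)
print(single_bit, "of", 1 << 12)

# A carry reaches bit 32 only if bits 5..31 of x are all set.
worst_case = 0xFFFFFFE0
print(hex(xor_difference(worst_case)))
```

Since 27 consecutive bits must be set for the carry to leave the lower 32 bits, a uniformly random value triggers this with probability \(2^{-27}\).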

4.3 Collisions

So far, the best practical collisions found for SHA-512 are those for 24 steps, proposed independently by Sanadhya and Sarkar [23] and Indesteege et al. [11], together with 24-step collisions for SHA-256. While the results for SHA-256 have since been improved to 27 [18], 28 [19] (both practical), and finally 31 steps [19] (theoretical attack with almost practical complexity), no such improvements have been proposed for SHA-512 so far. The main reason for this seems to be the doubling in state size from SHA-256 to SHA-512; this larger search space increases the difficulty of the problem for the search tools.

Starting Point for SHA-512. Since the message expansion is essentially the same for all SHA-2 variants (except for different word sizes and rotation values, of course), the SHA-256 starting points can theoretically also be used for SHA-512. However, the resulting search complexity is different. For our results, we used the 27-step starting point (based on a local collision over the \(t=11\) steps 7–17), as illustrated in Fig. 3a. Just as the 39-step semi-free-start starting point (Fig. 3b), it requires that differences cancel in E in 4 of the t steps (\(E_{14}, \ldots , E_{17}\)) and in A in the 4 previous steps (\(A_{10}, \ldots , A_{13}\)), as well as in several steps of the message expansion.

Finding a solution from this starting point requires significantly more effort than for SHA-256. Of course, we also tried to expand our search to the closely related 28-step starting point, which adds an additional step in the beginning of the 27-step version. However, with the additional constraints imposed on the message expansion by this added step we could not find any suitable (reasonably sparse) characteristics.

In contrast to the results from Sect. 4.2, since the IV needs to exactly match the original IV, we were not able to take advantage of the final truncation to simplify the search process or to add additional steps. We first search for a characteristic for SHA-512, and then try to use it to match the different IVs for SHA-512/224 and SHA-512/256.

Search Strategy. The search progresses in several stages, as illustrated in Fig. 5:

Fig. 5. Stages of the 27-step collision search (guessed values and differences, derived values, and previously fixed values and differences).

  1. Fix Signed Characteristic:

    (a) Find Candidate Characteristic (Fig. 5a): First fix the signed differences of the message expansion W (5 words) and state update A (3 words). Since the word \(W_{17}\) poses conditions on the first few message words, whose freedom we will later need to match the IV, we focus on keeping its signed difference as sparse as possible, with only a few difference bits. With much lower priority, also determine the differences in the state update words E (7 words) to complete the signed characteristic. The characteristic is very dense in E, but this only has limited influence on the success of the IV matching phase.

    (b) Verify Dense Parts (Fig. 5b): Fully determine the values of A and E in the densest steps 7–9 to verify the validity of the candidate characteristic. If necessary, fix any remaining free bits of A and E in steps 10–11. This fully determines \(A_3, \ldots , A_{11}\), \(E_7, \ldots , E_{11}\) and \(W_{11}\).

    To maneuver the search process in the large search space and detect contradictions as soon as possible, we need to apply the look-ahead strategies previously employed for semi-free-start collisions on SHA-512 [9] in this stage (with 16 look-ahead candidates per guess).

  2. Message Modification to Match IV: Starting from the best signed characteristics of the previous stage, with the correct IV inserted, find a solution message pair step by step:

    (a) Match IV (Fig. 5c): Fix the values in the more difficult, heavily constrained words first (\(W_{10}, W_9, W_8, W_7\)). Choosing \(W_{10}\) and \(W_9\) also determines \(A_2\) and \(A_1\) (via \(E_6\) and \(E_5\)). Together with \(W_7\), \(W_8\), and the IV, this determines all values in steps 0–11.

    (b) Finalize Message for Sparse Parts (Fig. 5d): Choosing the 4 remaining message words \(W_{12}, \ldots , W_{15}\) makes it possible to satisfy the remaining, sparse parts of the characteristic in steps 12–26 with high probability.

    Unlike the other stages, guesses are not made randomly here, but systematically word-by-word. Since most conditions are from modular additions, we always start from the least significant bits and proceed towards the more significant bits. This last stage needs to be repeated for each IV separately, which takes some hours on a single CPU per target IV.

Results. Our results for collisions for 27-step SHA-512/224, SHA-512/256, and SHA-512 are given in Appendix A in Tables 2b, 3b, and 4b, respectively.