The most frequently used algorithms for basis reduction are variants of the LLL algorithm of Lenstra et al. [18] and the BKZ algorithm of Schnorr and Euchner [32]. On the other hand, when primal or dual reductions are done for blocks of size at most s only (with fixed s), the currently best guarantees for the reduced basis are given—when s divides the dimension—by the slide reduction algorithm of Gama and Nguyen [11]. They showed that slide reduction yields a Hermite exponent bounded by the Mordell constant \(\mu _s\) and a modified approximation exponent (cf. (31)) bounded by \(2\mu _s\),
$$\begin{aligned} h(g)\le \mu _s,~~~\widetilde{a}_{n-s+1}(g) \le 2\mu _s, \end{aligned}$$
(32)
appropriate since for a slide reduced basis B, the block \(B^{n-s+1:n}\) is primal reduced. Similar, though slightly inferior, results were proved by Li and Wei [19] when the maximal block size does not divide the dimension.
We then introduce a simple way to analyze the self-dual SDBKZ variant of the BKZ reduction algorithm, recently introduced by Micciancio and Walter [21], improving on the dynamical system technique of Hanrot et al. [14]. We reprove their bound \(\mu _s\) for the Hermite exponent [matching the first slide inequality in (32)] and prove a polynomial complexity result conjectured in [21]. The known techniques do not seem sufficient to prove a bound for the approximation exponent of SDBKZ.
3.2 Greedy LLL algorithms
To turn the general recipe into an efficient algorithm we must decide upon the order in which Lagrange steps are performed. Traditionally, these are chosen in a way determined by a fixed loop structure. In this section we consider greedy choices, where in each step some utility measure is maximized. The measure with respect to which we are greedy must be chosen carefully, in view of the following statement by Lovász on greediness in basis reduction:

“It seemed that the less greedy you were, the better it worked. So I only swapped neighboring vectors and only swapped when you really made progress by a constant factor.” (Smeets [34, p. 11]).
Storjohann [36, p. 13] suggested performing each Lagrange step on the block \(B^{k:k+1}\) for which the lower bound \(\delta _k\) from (33) on the amount by which \(g_k\) decreases in a Lagrange reduction is largest. We shall call an algorithm that completes this description by a tie-breaking rule a basic greedy LLL algorithm. The basic greedy strategy can be observed experimentally to outperform many others. It was rediscovered by Zhao et al. [39] in the context of (low-dimensional) signal processing applications. Another greedy variant of LLL (and of slide reduction) was considered by Schnorr [31].
When \(\beta \) is large and n is fixed, a basic greedy LLL algorithm typically performs only \(O(1+\lg \beta )\) Lagrange reductions, which is much less than the bound (40). While a complexity bound of \(O(1+\lg \beta )\) Lagrange reductions was proved by Hanrot et al. [14] for a cyclic LLL algorithm that performs Lagrange reductions on the blocks \(B^{k:k+1}\) in increasing cyclic order, it appears impossible to prove an unconditional logarithmic complexity result for the basic greedy LLL algorithm. Schnorr [31] obtained only partial results, and had to assume an obscure technical condition with an early termination exit that endangers the quality of the reduced basis.
The main difficulty in the analysis is the possibility that the bit profile (which in the most typical cases has—apart from small random-looking deviations—an essentially concave, nearly quadratic shape, reflected in a nearly monotone decreasing \(e_i\) sequence) may exhibit large discontinuities. The top example of Fig. 1 illustrates such an atypical case from applications. Even more atypical cases arise when the effective dimension is less than the full dimension. Although one expects these cases to be reduced even more quickly than the regularly shaped ones, the tools presently available do not seem to allow one to demonstrate this.
The technical obstacles can be overcome by changing the measure according to which the greedy choice is made.
A special greedy LLL algorithm always applies Lagrange reductions to blocks \(B^{k:k+1}\) that maximize the scaling-invariant number
$$\begin{aligned} \Delta _k:=\min \{c_k-2\Gamma _k,(k+1)g_k-kg_{k+1}\}, \end{aligned}$$
(43)
where \(c_k\) is given by (34), until all \(\Delta _k<\Delta \), where \(\Delta >0\) is a small threshold. We may analyze the behavior of this greedy rule in terms of the potential
$$\begin{aligned} p:=\sum _{k=1}^{n-1} \Big ((k+1)g_k-kg_{k+1}\Big )_+. \end{aligned}$$
(44)
The proof of Theorem 3.1 shows that (38), which is the area under the bit profile, is a reasonable measure of how far a basis is from being reduced. If all terms in the sum (44) are positive then \(p=2A(g)-(n-1)g_n\) is, up to a constant shift, twice this area; in general, p may be larger, accounting for an irregular behavior of the bit profile.
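For concreteness, here is a small Python sketch (ours, not from the text) of the potential (44) and of the greedy choice (43); the quantities \(c_k\) and \(\Gamma _k\) from (34) are assumed to be supplied by the caller.

```python
def potential(g):
    """Potential (44) of a bit profile; g is 0-indexed, g[k-1] = g_k."""
    n = len(g)
    # sum of the positive parts of (k+1) g_k - k g_{k+1}, k = 1, ..., n-1
    return sum(max((k + 1) * g[k - 1] - k * g[k], 0.0) for k in range(1, n))

def next_block(g, c, Gamma, Delta):
    """Greedy rule (43): an index k maximizing Delta_k, or None once
    all Delta_k < Delta.  c[k] and Gamma[k] encode c_k and Gamma_k."""
    n = len(g)
    best_k, best_val = None, Delta
    for k in range(1, n):
        d_k = min(c[k] - 2 * Gamma[k], (k + 1) * g[k - 1] - k * g[k])
        if d_k >= best_val:
            best_k, best_val = k, d_k
    return best_k
```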
The potential is a convex, nonnegative function of the \(g_i\). Therefore it attains its maximum at a vertex of any convex constraint set. Given only the dimension n and the maximal bit size \(\beta \) of a basis with integral coefficients, the maximum potential under the constraints (41) is attained for a profile where all \(g_i\in \{0,i\beta _i\}\). Writing \(K:=\{i \mid g_i=0\ne g_{i-1}\}\) we find that the worst case for the potential has the form \(p=\sum _{i\in K} 2i(i-1)\beta _{i-1}\). This is largest when \(K=\{n,n-2,n-4,\ldots \}\), leading to an initial bound of
$$\begin{aligned} p^{\text{init}}\le p^{\max } =\left\{ \begin{array}{ll} \displaystyle \sum _{j=1}^{n/2} 4j(2j-1)\beta _{2j-1} & (n\ \text{even})\\ \displaystyle \sum _{j=1}^{(n-1)/2} 4j(2j+1)\beta _{2j} & (n\ \text{odd})\end{array}\right\} \le \frac{n(n+2)(4n+1)}{3}(\beta +O(\lg n)). \end{aligned}$$
The following theorem shows that a special greedy LLL algorithm has a marginally better complexity than the cyclic LLL algorithm of Hanrot et al. [14], and at the same time gives stronger guarantees for the resulting reduced basis. (Hanrot et al. prove a constant bound on the Hermite exponent, but their method of analysis is unable to bound the approximation exponent.)
For example, if we start with a basis in Hermite normal form then \(g_1=\ldots =g_n=\beta \), hence \(p_2=\ldots =p_n=\beta \), hence \(p^{\text{init}}=(n-1)\beta \), and we find \(N_{\text{tot}}=O(n^3\log (1+\beta /n^2))\).
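As a quick sanity check of this example, using the potential sketch from above (with arbitrary illustrative numbers):

```python
beta, n = 1000, 20
g = [beta] * n                          # bit profile of a Hermite normal form basis
assert potential(g) == (n - 1) * beta   # each of the n-1 terms in (44) equals beta
```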
A basis is LLL reduced in the traditional sense if the second alternative in (45) holds for all k. This is guaranteed by our theorem only when no final \(p_k\) is tiny or negative. In view of (43), tiny or negative \(p_k\) indicate a temporary barrier for reduction, which may or may not be lifted in later iterations. The final reduced basis is LLL reduced only if all such barriers are ultimately lifted. However, greedy LLL reduction guarantees the same bounds (48) for the most important quality measures as a fully LLL reduced basis. (If needed, a fully LLL reduced basis can be obtained by continuing the LLL reduction as long as at least one of the reductions improves some \(g_k\) by more than \(\frac{1}{2}\Delta \).)
If a basis B is greedy LLL reduced, the mean slope \(a_n/(n-1)\) is bounded by the dimension-independent constant \(2\Gamma _2=2-\lg 3\approx 0.415\) obtained from (48). For random reduced bases, the factor is better. A Lagrange reduced and size reduced Cholesky factor \(\left( \begin{array}{cc} r_1 & sr_1 \\ 0 & r_2\end{array}\right) \) has \(r_1^2/r_2^2\le 1/(1-s^2)\), hence \(a_2=\lg (r_1^2/r_2^2)\le -\lg (1-s^2)\). Thus the expectation \(\langle a_2\rangle \) of \(a_2\) is bounded by
$$\begin{aligned} \langle a_2\rangle \le \overline{a}_2:=\langle -\lg (1-s^2)\rangle =-\langle \ln (1-s^2)\rangle /\ln 2. \end{aligned}$$
For example,
$$\begin{aligned} \overline{a}_2=\left\{ \begin{array}{ll}\Big (2-\ln \displaystyle \frac{27}{4}\Big )\Big /\ln 2 \approx 0.1305 & \text{if}\ s\ \text{is uniformly distributed in}\ \Big [-\displaystyle \frac{1}{2},\frac{1}{2}\Big ],\\ \Big (1+3\ln \displaystyle \frac{3}{4}\Big )\Big /\ln 2 \approx 0.1976 & \text{if}\ s^2\ \text{is uniformly distributed in}\ \Big [0,\displaystyle \frac{1}{4}\Big ].\end{array} \right. \end{aligned}$$
The empirical bound 0.16 for LLL-reduced bases of random lattices, calculated from remarks in Nguyen and Stehlé [24], is somewhere in between.
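Both closed-form values are easy to confirm numerically; the following Monte Carlo check (ours, for illustration only) reproduces them.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10**7

# s uniformly distributed in [-1/2, 1/2]
s = rng.uniform(-0.5, 0.5, N)
print(np.mean(-np.log2(1 - s**2)))   # ~ 0.1305 = (2 - ln(27/4)) / ln 2

# s^2 uniformly distributed in [0, 1/4]
s2 = rng.uniform(0.0, 0.25, N)
print(np.mean(-np.log2(1 - s2)))     # ~ 0.1976 = (1 + 3 ln(3/4)) / ln 2
```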
3.3 SDBKZ reduction
In 2011, Hanrot et al. [14] introduced a variant of the BKZ algorithm of Schnorr and Euchner [32] that organized individual primal reduction steps into tours, in such a way that the effect of a whole tour can be quantified. Hanrot et al. showed that exploiting the bit size inequalities introduced above reduces much of the complexity analysis to a study of linear equations and inequalities. Before their work, this underlying linear structure was invisible, since the analysis was—with the single exception of Schönhage [33, Lemma 4.1]—always done in a multiplicative way.
Micciancio and Walter [21] use this technique to partially analyze a self-dual variant of the BKZ algorithm called SDBKZ. In this algorithm, given some block size s (\(2<s<n\)), tours of primal reduction of the blocks \(B^{i:i+s-1}\) for \(i=1,\ldots ,n-s\) and tours of dual reduction of the blocks \(B^{i-s+2:i+1}\) for \(i=n-1,n-2,\ldots ,s\) alternate until no reduction gives an improvement. Assuming termination of the algorithm (which apparently happens in practice but was not demonstrated in theory), they proved for the resulting reduced basis the same bound \(\mu _s\) on the Hermite exponent as one has for a slide reduced basis.
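The tour structure can be summarized in a short Python sketch (schematic only, not the authors' code); primal_reduce and dual_reduce are assumed black boxes that perform the respective block reduction in place and report whether they improved the basis.

```python
def sdbkz(B, s, primal_reduce, dual_reduce):
    """Schematic SDBKZ tour structure for block size 2 < s < n."""
    n = len(B)
    improved = True
    while improved:
        improved = False
        # primal tour: blocks B^{i:i+s-1} for i = 1, ..., n-s
        for i in range(1, n - s + 1):
            improved |= primal_reduce(B, i, i + s - 1)
        # dual tour: blocks B^{i-s+2:i+1} for i = n-1, n-2, ..., s
        for i in range(n - 1, s - 1, -1):
            improved |= dual_reduce(B, i - s + 2, i + 1)
    return B
```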
In the following, we simplify the complicated analysis of Hanrot et al. [14]. In particular, we present—as conjectured in [21]—a way to terminate the SDBKZ algorithm in polynomial time while essentially preserving the theoretical guarantees of the original SDBKZ. Moreover, the analysis suggests a way to skip certain reduction steps in the tours without compromising the quality of the output, thereby speeding up the algorithm.
Our analysis of the SDBKZ algorithm is based on a new global measure of the quality of a basis. As we saw in the analysis of the LLL algorithm, basis reduction amounts to shrinking the bit profile by making the bit sizes \(g_i\) smaller. This can be done independently when the block size is \(s=2\), which allows an elegant analysis of LLL algorithms. However, reducing one of the \(g_i\) by primal or dual reduction of a block of size \(s>2\) has an effect on some of the neighboring bit sizes that is not easy to quantify. One therefore needs to look for a suitable quality measure that has a predictable behavior under block primal or dual reduction.
The basic new idea is to consider the tightest parabolic dome that sits above the bit profile \(g_0,\ldots ,g_n\) and interpolates the two end points. By setting up the interpolation conditions one finds that the curvature of the dome is characterized by the bit defect
$$\begin{aligned} \widetilde{\mu }:=\max _i\frac{1}{n-i}\Big (\frac{g_i}{i}-\frac{g_n}{n}\Big ). \end{aligned}$$
(57)
In particular, \(\widetilde{\mu }\) is an upper bound for the Hermite exponent (24). When the bit defect is large, the dome is highly curved and one expects to be able to gain a lot through reduction; when the bit defect is tiny or even negative, the dome is flat or bowl-shaped, and only little can be gained.
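Computing the bit defect of a profile is straightforward; the following sketch (ours) takes g as a 0-indexed array holding \(g_1,\ldots ,g_n\) and maximizes over \(i=1,\ldots ,n-1\) (the term for \(i=n\) is vacuous).

```python
def bit_defect(g):
    """Bit defect (57) of the bit profile g, with g[i-1] = g_i."""
    n = len(g)
    return max((g[i - 1] / i - g[n - 1] / n) / (n - i) for i in range(1, n))
```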
One may now consider how the bit defect changes when one or more primal or dual reductions are applied to a basis. It turns out that this indeed works well for the cyclic BKZ algorithm analyzed in [14]; cf. the remarks further below. However, in order to apply the idea to the SDBKZ algorithm (which has the better theoretical bound on the Hermite exponent), we need to take into account that this algorithm does not perform any reductions of small blocks at the lower and upper end. For optimal results, one therefore needs to work with truncated versions of the bit defect, defined for a fixed block size \(s>2\). The primal bit defect
$$\begin{aligned} \mu :=\max _{i\le n-s}\frac{1}{n-i}\Big (\frac{g_i}{i}-\frac{g_n}{n}\Big ), \end{aligned}$$
is the smallest number \(\mu \) such that
$$\begin{aligned} \frac{g_i}{i}-\frac{g_n}{n}\le (n-i)\mu ~~~\text{ for } i=1,\ldots ,n-s. \end{aligned}$$
(58)
In particular, the case \(i=1\) says that the Hermite exponent (24) satisfies \(h(g)\le \mu \). If at some point \(\mu \le \mu _s\), this implies the bound \(\mu _s\) on the Hermite exponent guaranteed for slide reduction. Similarly, the dual bit defect
$$\begin{aligned} \mu ^\dagger :=\max _{i\ge s}\frac{1}{n-i}\Big (\frac{g_i}{i}-\frac{g_n}{n}\Big ), \end{aligned}$$
is the smallest number \(\mu ^\dagger \) such that
$$\begin{aligned} \frac{g_i}{i}-\frac{g_n}{n}\le (n-i)\mu ^\dagger ~~~\text{ for } i=s,\ldots ,n. \end{aligned}$$
(59)
A small dual bit defect implies a good Hermite exponent of the dual basis.
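The truncated variants differ from (57) only in the range over which the maximum is taken; in the same conventions as the sketch above:

```python
def primal_bit_defect(g, s):
    """Primal bit defect (58): maximum over i = 1, ..., n-s."""
    n = len(g)
    return max((g[i - 1] / i - g[n - 1] / n) / (n - i) for i in range(1, n - s + 1))

def dual_bit_defect(g, s):
    """Dual bit defect (59): maximum over i = s, ..., n-1
    (the term for i = n in (59) is vacuous and is omitted)."""
    n = len(g)
    return max((g[i - 1] / i - g[n - 1] / n) / (n - i) for i in range(s, n))
```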
The following theorem implies that, when started from an LLL reduced basis, the SDBKZ algorithm comes in polynomial time arbitrarily close to satisfying \(\mu \le \mu _s\).
Without compromising the complexity order, one may run the algorithm in practice for up to \(O(n^2)\) additional tours beyond those needed to reach \(\mu \le \mu ^*\), where \(\mu ^*\) is taken slightly larger than \(\mu _s\). Since the a priori bound derived above is somewhat pessimistic for most (or even all?) lattice bases, a significantly smaller \(\mu \) can typically be achieved. It is clear from the argument that only those reductions must be carried out for which \(g_i\) does not yet satisfy the bound guaranteed by the above analysis. Thus only those reductions are carried out where \(g_i\) is nearly largest when measured in the correct units. This introduces an element of laziness into the algorithm and speeds it up without affecting the worst case bound for the number of tours. To make the first basis vector small quickly, it is also beneficial to begin the reduction with a dual rather than a primal tour. The reason is that a dual tour transports poor basis vectors towards higher indices and thus improves the quality of the leading blocks, which is then immediately exploited in the first step of the primal tour.
The cyclic variant of the BKZ algorithm analyzed in Hanrot et al. [14] proceeds by using primal tours only, but these are extended to shorter blocks towards the end of the basis. In this case, a similar analysis works, with the same N but using the symmetric bit defect defined by (57). The resulting new proof (whose details are left to the reader) is far simpler than that of [14] and yields the same convergence rate as given above for SDBKZ, which is a factor of approximately 16 better than the bound on the rate derived in [14]. The final bound on \(\mu \), and hence the Hermite factor resulting for BKZ, is slightly weaker than that for SDBKZ.
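In the interface of the SDBKZ sketch above, one such primal tour with blocks shrinking towards the end of the basis might look as follows (our reading of the tour structure, not code from [14]):

```python
def cyclic_bkz_tour(B, s, primal_reduce):
    """One primal tour: blocks of size s, truncated to B^{i:n} once
    they would extend past the end of the basis."""
    n = len(B)
    for i in range(1, n):
        primal_reduce(B, i, min(i + s - 1, n))
```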
Unfortunately, neither the above technique nor the original technique of Hanrot et al. is able to bound the approximation exponent or the enumeration exponent. In particular, unlike BKZ (where Schnorr [30] gives bounds on the approximation exponent) and slide reduction, SDBKZ is (at present) not guaranteed to find a very short vector when \(\lambda _1(B)\) is much smaller than the trivial Hermite bound \(\sqrt{\gamma _n d_n^{1/n}}\). (One could of course bound the approximation exponent by performing O(n) runs of SDBKZ according to the recipe of Lovász [20, p. 25].)