Open Access 2022 | OriginalPaper | Chapter

7. Lattice-Based Cryptography

Author : Zhiyong Zheng

Published in: Modern Cryptography Volume 1

Publisher: Springer Singapore


Abstract

Let \(\mathbb {R}^n\) be an n-dimensional Euclidean space and let \(x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n\) be an n-dimensional vector; x may be a row vector or a column vector, depending on the context. If \(x\in \mathbb {Z}^n\), then x is called an integral point. \(\mathbb {R}^{m\times n}\) denotes the set of all \(m\times n\) matrices over \(\mathbb {R}\). For \(x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n\) and \(y=(y_1, y_2, \ldots , y_n)\in \mathbb {R}^n\), define the inner product of x and y as \(\langle x, y \rangle =\sum _{i=1}^n x_iy_i\).

© The Author(s) 2022
Zhiyong Zheng, Modern Cryptography Volume 1, Financial Mathematics and Fintech, 10.1007/978-981-19-0920-7_7

7.1 Geometry of Numbers

Let \(\mathbb {R}^n\) be an n-dimensional Euclidean space and let \(x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n\) be an n-dimensional vector; x may be a row vector or a column vector, depending on the context. If \(x\in \mathbb {Z}^n\), then x is called an integral point. \(\mathbb {R}^{m\times n}\) denotes the set of all \(m\times n\) matrices over \(\mathbb {R}\). For \(x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n\) and \(y=(y_1, y_2, \ldots , y_n)\in \mathbb {R}^n\), define the inner product of x and y as
$$\begin{aligned} \langle x, y \rangle =\sum _{i=1}^n x_iy_i. \end{aligned}$$
(7.1)
The length |x| of vector x is defined as
$$\begin{aligned} |x|=\sqrt{\langle x, x \rangle }=\left( \sum _{i=1}^n x_i^2\right) ^{1/2}. \end{aligned}$$
(7.2)
For \(\lambda \in \mathbb {R}\), the scalar multiple \(\lambda x\) is defined as
$$\begin{aligned} \lambda x=(\lambda x_1, \lambda x_2, \ldots , \lambda x_n). \end{aligned}$$
(7.3)
If the inner product \(\langle x, y \rangle =0\), the vectors x and y are said to be orthogonal, denoted \(x\bot y\).
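These definitions translate directly into code. The following is a minimal Python sketch (the example vectors x and y are ours, not from the text) checking the inner product (7.1), the length (7.2), orthogonality, and the Pythagorean identity of Lemma 7.1 (iv):

```python
import math

def inner(x, y):
    # Inner product <x, y> = sum of x_i * y_i  (Eq. 7.1)
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # Length |x| = sqrt(<x, x>)  (Eq. 7.2)
    return math.sqrt(inner(x, x))

# Example vectors: (1, 2, 2) has length 3; (2, 1, -2) is orthogonal to it.
x = (1.0, 2.0, 2.0)
y = (2.0, 1.0, -2.0)

assert norm(x) == 3.0
assert inner(x, y) == 0.0                # x is orthogonal to y
# Pythagorean theorem: |x + y|^2 = |x|^2 + |y|^2 when x is orthogonal to y
s = [xi + yi for xi, yi in zip(x, y)]
d = [xi - yi for xi, yi in zip(x, y)]
assert abs(norm(s)**2 - (norm(x)**2 + norm(y)**2)) < 1e-12
assert abs(norm(s) - norm(d)) < 1e-12    # Eq. (7.4): |x + y| = |x - y|
```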

Lemma 7.1

Let \(x, y\in \mathbb {R}^n\), \(\lambda \in \mathbb {R}\) is any real number, then
  1. (i)

    \(|x|\ge 0\), \(|x|=0\) if and only if \(x=0\) is a zero vector;

     
  2. (ii)

    \(|\lambda x|=|\lambda ||x|\), \(\forall ~ x\in \mathbb {R}^n\), \(\lambda \in \mathbb {R}\);

     
  3. (iii)

    (Triangle inequality) \(|x+y|\le |x|+|y|\), and \(|x-y|\ge ||x|-|y||\);

     
  4. (iv)
    (Pythagorean theorem) If and only if \(x\bot y\), we have
    $$|x\pm y|^2=|x|^2+|y|^2.$$
     

Proof

(i) and (ii) follow directly from the definition. To prove (iii), let \(x=(x_1, x_2, \ldots , x_n), y=(y_1, y_2, \ldots , y_n)\in \mathbb {R}^n\); by the Cauchy–Schwarz (Hölder) inequality:
$$\left| \sum _{i=1}^nx_iy_i \right| \le \left[ \left( \sum _{i=1}^n x_i^2\right) \left( \sum _{i=1}^ny_i^2\right) \right] ^{\frac{1}{2}}.$$
So there is
$$\begin{aligned} |x+y|^2&= \sum _{i=1}^n (x_i+y_i)^2=\sum _{i=1}^n x_i^2+2\sum _{i=1}^n x_iy_i +\sum _{i=1}^n y_i^2\\&\le \sum _{i=1}^nx_i^2+2 \left| \sum _{i=1}^nx_iy_i \right| +\sum _{i=1}^ny_i^2\\&\le \left( \sqrt{\sum _{i=1}^nx_i^2}+\sqrt{\sum _{i=1}^ny_i^2}\right) ^2=(|x|+|y|)^2, \end{aligned}$$
so (iii) holds. Then, by the definition of inner product,
$$\langle x\pm y, x\pm y \rangle =\langle x,x \rangle \pm 2 \langle x, y \rangle + \langle y, y \rangle ,$$
if \(x\bot y\), then
$$|x\pm y|^2=|x|^2+|y|^2.$$
Conversely, if x is not orthogonal to y, then \( \langle x, y \rangle \not =0\), thus
$$|x\pm y|^2\not =|x|^2+|y|^2.$$
Lemma 7.1 holds.
From the Pythagorean theorem, for orthogonal vectors \(x\bot y\), we have the following conclusion:
$$\begin{aligned} |x+y|=|x-y|, \text{ if }\ x\bot y. \end{aligned}$$
(7.4)

Definition 7.1

Let \(\mathscr {R}\subset \mathbb {R}^n\) be a subset, \(0\in \mathscr {R}\), \(\mathscr {R}\) is called a symmetric convex body of \(\mathbb {R}^n\), if
  1. (i)

    \(x\in \mathscr {R}, \Rightarrow -x\in \mathscr {R}\) (Symmetry);

     
  2. (ii)

    Let \(x, y\in \mathscr {R}\), \(\lambda \ge 0, \mu \ge 0\), and \(\lambda +\mu =1\), then \(\lambda x+\mu y\in \mathscr {R}\) (Convexity).

     
The following is a famous example of a symmetric convex body defined by a set of linear inequalities. Let \(A=(a_{ij})_{m\times n}\in \mathbb {R}^{m\times n}\) be an \(m\times n\) matrix and let \(c=(c_1, c_2, \ldots , c_m)\in \mathbb {R}^m\) with every \(c_i\ge 0\). Define \(\mathscr {R}(A, c)\) as the set of solutions \(x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n\) of the following m linear inequalities:
$$\begin{aligned} \left| \sum _{j=1}^n a_{ij}x_j \right| \le c_i, \ \ 1\le i\le m. \end{aligned}$$
(7.5)
We have

Lemma 7.2

For any \(A\in \mathbb {R}^{m\times n}\) and any positive vector \(c=(c_1, c_2, \ldots , c_m)\in \mathbb {R}^m\), the set \(\mathscr {R}(A, c)\) is a symmetric convex body in \(\mathbb {R}^n\).

Proof

Obviously the zero vector \(x=(0,0, \ldots , 0)\in \mathscr {R}(A, c)\), and \(x\in \mathscr {R}(A, c)\Rightarrow -x\in \mathscr {R}(A, c)\). So we only need to prove the convexity of \(\mathscr {R}(A, c)\). Suppose \(x, y\in \mathscr {R}(A, c)\), and let
$$z=\lambda x+\mu y, \lambda>0, \mu >0, \lambda +\mu =1.$$
Then for any \(1\le i\le m\), we have
$$\begin{aligned}&|a_{i1}z_1+a_{i2}z_2+\cdots +a_{in}z_n|\\&\le \lambda |a_{i1}x_1+a_{i2}x_2+\cdots +a_{in}x_n|+\mu |a_{i1}y_1+a_{i2}y_2+\cdots +a_{in}y_n|\\&\le \lambda c_i+\mu c_i=c_i. \end{aligned}$$
So \(z=\lambda x+\mu y\in \mathscr {R}(A, c)\). Thus, \(\mathscr {R}(A, c)\) is a symmetric convex body. Lemma 7.2 holds.
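The defining conditions of a symmetric convex body can be tested numerically for \(\mathscr {R}(A, c)\). A small Python sketch follows; the 2×2 matrix A, the bound vector c, and the sample points are hypothetical examples of ours:

```python
def in_R(A, c, x):
    # Membership test for the symmetric convex body R(A, c) of (7.5):
    # |sum_j a_ij x_j| <= c_i for every row i of A.
    return all(abs(sum(aij * xj for aij, xj in zip(row, x))) <= ci
               for row, ci in zip(A, c))

# A hypothetical 2x2 system and bound vector.
A = [[1.0, 1.0], [1.0, -1.0]]
c = [2.0, 2.0]

x, y = (1.0, 0.5), (-0.5, 1.0)
assert in_R(A, c, x) and in_R(A, c, y)
# Symmetry: x in R  =>  -x in R.
assert in_R(A, c, [-t for t in x])
# Convexity: every point lam*x + mu*y with lam, mu >= 0, lam + mu = 1 stays in R.
for k in range(11):
    lam = k / 10.0
    z = [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]
    assert in_R(A, c, z)
```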

Lemma 7.3

Let \(\mathscr {R}\subset \mathbb {R}^n\) be a symmetric convex body and \(x\in \mathscr {R}\); then for \(|\lambda |\le 1\) we have \(\lambda x\in \mathscr {R}\).

Proof

By convexity, let
$$\rho =\frac{1}{2}(1+\lambda ), \ \ \sigma =\frac{1}{2}(1-\lambda ).$$
Then \(\rho \ge 0\), \(\sigma \ge 0\), and \(\rho +\sigma =1\). So there is
$$\rho x+\sigma (-x)=\lambda x\in \mathscr {R}.$$
The Lemma holds.

Lemma 7.4

If \(x, y\in \mathscr {R}\), then \(\lambda x+\mu y\in \mathscr {R}\), where \(\lambda , \mu \) are real numbers, and satisfies \(|\lambda |+|\mu |\le 1\).

Proof

Let \(\eta _1\) be the sign of \(\lambda \) and \(\eta _2\) be the sign of \(\mu \), then by Lemma 7.3,
$$\begin{aligned} x'&=\eta _1(|\lambda |+|\mu |)x\in \mathscr {R},\\ y'&=\eta _2(|\lambda |+|\mu |)y\in \mathscr {R}. \end{aligned}$$
Let \(\rho =\frac{|\lambda |}{|\lambda |+|\mu |}\), \(\sigma =\frac{|\mu |}{|\lambda |+|\mu |}\), then \(\rho +\sigma =1\). By definition, we have
$$\lambda x+\mu y=\rho x' +\sigma y'\in \mathscr {R},$$
thus the Lemma holds. This result extends without difficulty to the case of n vectors.

Lemma 7.5

(Blichfeldt) Let \(\mathscr {R}\subset \mathbb {R}^n\) be any region in \(\mathbb {R}^n\) and V be the volume of \(\mathscr {R}\). If \(V>1\), then there are two different vectors \(x\in \mathscr {R}, x'\in \mathscr {R}\) so that \(x-x'\) is an integral point (thus a nonzero integral point).

Proof

For \(\forall x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n\), we define
$$\begin{aligned}{}[[x]]=([x_1], [x_2], \ldots , [x_n])\in \mathbb {Z}^n \end{aligned}$$
(7.6)
and
$$\begin{aligned}{}[x]=(\delta _1, \delta _2, \ldots , \delta _n)\in \mathbb {Z}^n, \end{aligned}$$
(7.7)
where \([x_i]\) is the integer part (square bracket function) of \(x_i\) and \(\delta _i\) is the nearest integer to \(x_i\).
For each integral point \(u\in \mathbb {Z}^n\), define
$$\mathscr {R}_u=\{x\in \mathscr {R}|[[x]]=u \}$$
and
$$D_u=\{x-u|x\in \mathscr {R}_u\}.$$
Note that \(\mathscr {R}_{u_1}\cap \mathscr {R}_{u_2}=\varnothing \) whenever \(u_1\not =u_2\), and \(\mathscr {R}=\bigcup _{u}\mathscr {R}_u\). Therefore
$$\begin{aligned} V=\mathrm{Vol}(\mathscr {R})&=\sum _u \mathrm{Vol}(\mathscr {R}_u)\\&=\sum _u V_u>1, \end{aligned}$$
where \(V_u=\mathrm{Vol}(\mathscr {R}_u)\); since \(D_u\) is a translate of \(\mathscr {R}_u\), also \(V_u=\mathrm{Vol}(D_u)\). Each \(D_u\) is contained in the unit cube:
$$\bigcup _u D_u \subset [0, 1)\times \cdots \times [0, 1).$$
If the sets \(D_u\) were pairwise disjoint, then
$$\sum _u V_u=\mathrm{Vol}\left( \bigcup _u D_u \right) \le 1,$$
a contradiction. Therefore, there must be two different integral points u and \(u'\ \ (u\not = u')\) with \(D_u\cap D_{u'}\not =\varnothing \), that is, \(x-u=x'-u'\) for some \(x\in \mathscr {R}_u\), \(x'\in \mathscr {R}_{u'}\), whence \(x-x'=u-u'\in \mathbb {Z}^n\) is a nonzero integral point. The Lemma holds.

Lemma 7.6

(Minkowski) Let \(\mathscr {R}\) be a symmetric convex body, and the volume of \(\mathscr {R}\)
$$V=\mathrm{Vol}(\mathscr {R})>2^n,$$
then \(\mathscr {R}\) contains at least one nonzero integer point.

Proof

Let
$$\frac{1}{2}\mathscr {R}=\left\{ \frac{1}{2}x| x\in \mathscr {R}\right\} .$$
Thus
$$\mathrm{Vol}\left( \frac{1}{2}\mathscr {R}\right) =\frac{1}{2^n}V>1,$$
by Lemma 7.5, there are two different vectors \(x', x''\in \frac{1}{2}\mathscr {R}\) such that \(u=x'-x''\) is a nonzero integral point. We prove \(u\in \mathscr {R}\). Write \(x'=\frac{1}{2}y, x''=\frac{1}{2}z\), where \(y, z\in \mathscr {R}\). Then
$$u=\frac{1}{2}y-\frac{1}{2}z,\ \ y\in \mathscr {R}, \ z\in \mathscr {R}.$$
By Lemma 7.4, then \(u\in \mathscr {R}\). The Lemma holds.
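Minkowski's theorem can be illustrated by brute-force search. The sketch below is a hypothetical planar example of ours, with a disk standing in for \(\mathscr {R}\): once the disk's area exceeds \(2^2=4\), a nonzero integral point must appear.

```python
from itertools import product

def minkowski_point(in_body, bound):
    # Search integer points of [-bound, bound]^2 for a nonzero point
    # lying in the given symmetric convex body (n = 2 here).
    n = 2
    for u in product(range(-bound, bound + 1), repeat=n):
        if u != (0, 0) and in_body(u):
            return u
    return None

# Hypothetical symmetric convex body: the open disk |x| < r.  Its area
# pi*r^2 exceeds 2^2 = 4 once r > 2/sqrt(pi) ~ 1.13, so Lemma 7.6
# guarantees a nonzero integral point inside.
r = 1.2
u = minkowski_point(lambda x: x[0]**2 + x[1]**2 < r * r, 2)
assert u is not None and u != (0, 0)
assert u[0]**2 + u[1]**2 < r * r
```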

Remark 7.1

The above conclusion of Minkowski cannot be improved; that is, the condition \(V>2^n\) cannot be weakened to \(V\ge 2^n\). A counterexample is
$$\mathscr {R}=\{x\in \mathbb {R}^n| x=(x_1, x_2, \ldots , x_n), \forall ~ |x_i|<1\}.$$
Obviously \(\mathrm{Vol}(\mathscr {R})=2^n\), but this open cube \(\mathscr {R}\) contains no nonzero integral point.
When \(\mathrm{Vol}(\mathscr {R})=2^n\), in order to guarantee that a symmetric convex body \(\mathscr {R}\) still contains a nonzero integral point, we need some supplementary constraints on \(\mathscr {R}\). First, we consider bounded regions. A set \(\mathscr {R}\subset \mathbb {R}^n\) is called bounded if
$$\mathscr {R}\subset \{x=(x_1, x_2, \ldots , x_n)\in \mathbb {R}^n| |x_i|\le B, 1\le i\le n\},$$
where B is a constant.

Lemma 7.7

Let \(A{\in }\mathbb {R}^{n\times n}\) be an invertible matrix, \(d=|\mathrm{det}(A)|>0\), and let \(c=(c_1, c_2, \ldots , c_n)\in \mathbb {R}^n\) be a positive vector, that is, \(\forall ~c_i>0\). Then the symmetric convex body \(\mathscr {R}(A, c)\) defined by Eq. (7.5) is bounded, and its volume is
$$\mathrm{Vol}(\mathscr {R}(A, c))=2^nd^{-1}c_1c_2\cdots c_n.$$

Proof

Let \(A=(a_{ij})_{n\times n}\). Write \(Ax=y\), so \(x=A^{-1}y\). Let \(A^{-1}=(b_{ij})_{n\times n}\); then for each \(x_i\) we have
$$|x_i|=\left| \sum _{j=1}^n b_{ij}y_j \right| \le \sum _{j=1}^n|b_{ij}|\cdot c_j\le B,$$
where B is a constant. Therefore, \(\mathscr {R}(A, c)\) is a bounded set. Obviously
$$\mathrm{Vol}(\mathscr {R}(A, c))=\idotsint \limits _{x=(x_1, x_2, \ldots , x_n)\in \mathscr {R}(A, c)} \mathrm{d}x_1 \mathrm{d}x_2\cdots \mathrm{d}x_n,$$
making the change of variables \(Ax=y\), we get
$$\mathrm{d}x=\mathrm{d}x_1\cdots \mathrm{d}x_n=\frac{1}{|\mathrm{det}(A)|}\mathrm{d}y_1\mathrm{d}y_2\cdots \mathrm{d}y_n.$$
Thus
$$\begin{aligned} \mathrm{Vol}(\mathscr {R}(A, c))&=\frac{1}{|\mathrm{det}(A)|}\int \limits _{-c_1}^{c_1}\cdots \int \limits _{-c_n}^{c_n}\mathrm{d}y_1 \mathrm{d}y_2\cdots \mathrm{d}y_n\\&=2^nd^{-1}\prod _{i=1}^n c_i, \end{aligned}$$
Lemma 7.7 holds.
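The volume formula of Lemma 7.7 can be checked by Monte Carlo integration. A Python sketch follows; the 2×2 matrix (with determinant 2), the bounds c, and the sampling box are hypothetical examples of ours:

```python
import random

def vol_R_monte_carlo(A, c, box, samples=200_000, seed=0):
    # Monte Carlo estimate of Vol(R(A, c)): sample the square
    # [-box, box]^2 uniformly and count the fraction of points with
    # |a_i1*x_1 + a_i2*x_2| <= c_i for every row i.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = (rng.uniform(-box, box), rng.uniform(-box, box))
        if all(abs(a0 * x[0] + a1 * x[1]) <= ci for (a0, a1), ci in zip(A, c)):
            hits += 1
    return hits / samples * (2 * box) ** 2

# Hypothetical invertible 2x2 matrix with det = 2 and bounds c = (1, 1);
# Lemma 7.7 predicts Vol = 2^2 * (1 * 1) / 2 = 2.
A = [(1.0, 1.0), (1.0, -1.0)]
c = (1.0, 1.0)
est = vol_R_monte_carlo(A, c, box=2.0)
assert abs(est - 2.0) < 0.1
```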

Remark 7.2

If in (7.5) “\(\le \)” is changed to “<” in the definition of \(\mathscr {R}(A, c)\), the above lemma still holds.

Now consider the general situation: let \(A=(a_{ij})_{m\times n}\). If \(m>n\) and \(\mathrm{rank}(A)=n\), then \(\mathscr {R}(A, c)\) defined by Eq. (7.5) is still a bounded region. Obviously, if \(m<n\), or if \(m=n\) and \(\mathrm{rank}(A)<n\), then \(\mathscr {R}(A, c)\) is an unbounded region and \(V=\infty \). Therefore, we have the following Corollary.

Corollary 7.1

Let \(A=(a_{ij})_{m\times n}\) with \(m<n\), or \(m=n\) and \(\det (A)=0\). Then for any small positive vector \(c=(c_1, c_2, \ldots , c_m), 0<c_i<\varepsilon \), \(\mathscr {R}(A, c)\) contains a nonzero integral point. In other words, the following m inequalities
$$\left| \sum _{j=1}^na_{ij}x_j \right| <\varepsilon , \ \ 1\le i\le m,$$
have a nonzero integer solution \(x=(x_1, x_2, \ldots , x_n)\in \mathbb {Z}^n\).

Proof

Given \(\varepsilon >0\), we have \(\mathrm{Vol}(\mathscr {R}(A, c))=\infty >2^n\). By Lemma 7.6, \(\mathscr {R}(A, c)\) contains at least one nonzero integral point.

Let \(A=(a_{ij})_{m\times n}\in \mathbb {R}^{m\times n}\) be an \(m\times n\) matrix and let \(c=(c_1, c_2, \ldots , c_m)\in \mathbb {R}^m\) be a positive vector, that is, \(\forall ~ c_i>0\). Define \(\mathscr {R}'(A, c)\) as the set of solutions \(x=(x_1, x_2, \ldots , x_n)\) of the following linear inequalities:
$$\begin{aligned} {\left\{ \begin{aligned}&\left| \sum _{j=1}^na_{1j}x_j\right| \le c_1,\\&\left| \sum _{j=1}^n a_{ij}x_j \right| <c_i,\ \ i=2, \ldots , m. \end{aligned} \right. } \end{aligned}$$
(7.8)
When \(A\in \mathbb {R}^{n\times n}\) is an invertible square matrix, we discuss the nonzero integral points in the symmetric convex body \(\mathscr {R}'(A, c)\).

Lemma 7.8

If \(A\in \mathbb {R}^{n\times n}\) is an invertible matrix and \(c=(c_1, c_2, \ldots , c_n)\) is a positive vector with
$$\begin{aligned} c_1c_2\cdots c_n\ge |\mathrm{det}(A)|, \end{aligned}$$
(7.9)
then \(\mathscr {R}'(A, c)\) contains a nonzero integral point.

Proof

When \(c_1c_2\cdots c_n>|\mathrm{det}(A)|\), because of
$$\mathrm{Vol}(\mathscr {R}'(A, c))=\frac{2^nc_1c_2\cdots c_n}{|\mathrm{det}(A)|}>2^n,$$
by Lemmas 7.6 and 7.7 the proposition holds, so we only need to discuss the case when equality holds in (7.9).
Let \(\varepsilon \) be any positive real number with \(0<\varepsilon <1\). Then, by the case just proved, there is a nonzero integral solution \(x^{(\varepsilon )}=(x_1^{(\varepsilon )}, x_2^{(\varepsilon )}, \ldots , x_n^{(\varepsilon )})\in \mathbb {Z}^n\) satisfying
$$\begin{aligned} {\left\{ \begin{aligned}&\left| \sum _{j=1}^na_{1j}x_j^{(\varepsilon )} \right| \le c_1+\varepsilon \le c_1+1,\\&\left| \sum _{j=1}^na_{ij}x_j^{(\varepsilon )}\right| <c_i,\ \ 2\le i\le n. \end{aligned} \right. } \end{aligned}$$
(7.10)
Moreover, there is an upper bound B, independent of \(\varepsilon \), such that
$$|x_j^{(\varepsilon )}|\le B,\ \ 1\le j\le n.$$
Only finitely many integral points \(x^{(\varepsilon )}\) satisfy this bound, so there must be one nonzero integral point \(x\not =0\) that satisfies (7.10) for every \(\varepsilon >0\). Letting \(\varepsilon \rightarrow 0\), the Lemma holds.
In the following discussion, we make the following restrictions on \(\mathscr {R}\subset \mathbb {R}^n\):
$$\begin{aligned} \mathscr {R} \text{ is } \text{ a } \text{ symmetric } \text{ convex } \text{ body, } \mathscr {R} \text{ is } \text{ bounded, } \text{ and } \mathscr {R} \text{ is } \text{ a } \text{ closed } \text{ subset } \text{ of } \mathbb {R}^n\text{. } \end{aligned}$$
(7.11)
Obviously, when A is an invertible square matrix of order n, for any positive vector \(c=(c_1, c_2, \ldots , c_n)\), \(\mathscr {R}(A, c)\) satisfies the above restriction (7.11), but \(\mathscr {R}'(A, c)\) does not, because \(\mathscr {R}'(A, c)\) is not closed.

Definition 7.2

If \(\mathscr {R}\subset \mathbb {R}^n\) satisfies the restriction (7.11), then for any \(x\in \mathbb {R}^n\), define the distance function F(x) as
$$\begin{aligned} F(x)=F_{\mathscr {R}}(x)=\inf \{ \lambda | \lambda >0, \lambda ^{-1}x\in \mathscr {R}\}. \end{aligned}$$
(7.12)
By definition, we obviously have the following elementary conclusions:
  1. (i)

    \(F(x)=0 \Leftrightarrow x=0\);

     
  2. (ii)
    If A is an invertible square matrix of order n, the distance function defined by \(\mathscr {R}(A, c)\) is
    $$\begin{aligned} F(x)=\max _{1\le i\le n} c_i^{-1}\left| \sum _{j=1}^{n}a_{ij}x_j \right| . \end{aligned}$$
    (7.12')
     

Property (i) can be derived from the boundedness of \(\mathscr {R}\), and property (ii) follows directly from the definition of \(\mathscr {R}(A, c)\). Later we will see that \(0\le F(x)<\infty \) holds for all \(x\in \mathbb {R}^n\). The main properties of the distance function F(x) are given in the following Lemma.
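Formula (7.12') is directly computable. A minimal Python sketch of the distance function for \(\mathscr {R}(A, c)\) follows; the identity-matrix body (the square \(|x_1|\le 1, |x_2|\le 1\)) and the test points are hypothetical examples of ours:

```python
def F(A, c, x):
    # Distance function (7.12') of the body R(A, c) for invertible A:
    # F(x) = max over i of c_i^(-1) * |sum_j a_ij x_j|.
    return max(abs(sum(aij * xj for aij, xj in zip(row, x))) / ci
               for row, ci in zip(A, c))

# Hypothetical body: the square |x_1| <= 1, |x_2| <= 1 (A = identity, c = (1, 1)).
A = [(1.0, 0.0), (0.0, 1.0)]
c = (1.0, 1.0)

assert F(A, c, (0.0, 0.0)) == 0.0            # F(x) = 0 iff x = 0
assert F(A, c, (0.5, -0.25)) == 0.5          # x in R iff F(x) <= 1
# Homogeneity: F(lam * x) = |lam| * F(x)
assert abs(F(A, c, (-1.5, 0.75)) - 1.5 * F(A, c, (1.0, -0.5))) < 1e-12
# Triangle inequality: F(x + y) <= F(x) + F(y)
x, y = (0.5, 0.25), (0.25, 0.75)
s = tuple(a + b for a, b in zip(x, y))
assert F(A, c, s) <= F(A, c, x) + F(A, c, y) + 1e-12
```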

Lemma 7.9

If F(x) is a distance function defined by \(\mathscr {R}\) satisfying the constraints, then
  1. (i)

    Let \(\lambda \ge 0\), then \(x\in \lambda \mathscr {R}\Leftrightarrow F(x)\le \lambda \);

     
  2. (ii)

    \(F(\lambda x)=|\lambda |F(x)\) holds for all \(\lambda \in \mathbb {R}, x\in \mathbb {R}^n\);

     
  3. (iii)

    \(F(x+y)\le F(x)+F(y), \forall ~x, y\in \mathbb {R}^n\).

     

Proof

Since \(\mathscr {R}\) is closed, by the definition, \(F^{-1}(x)x\in \mathscr {R}\). Thus, if \(\lambda \ge F(x)\), then by Lemma 7.3,
$$\lambda ^{-1}x=\frac{F(x)}{\lambda }\cdot F^{-1}(x)x, \ \ \left| \frac{F(x)}{\lambda } \right| \le 1,$$
so \(\lambda ^{-1}x\in \mathscr {R}\Rightarrow x\in \lambda \mathscr {R}\). Conversely, if \(\lambda <F(x)\), then \(\lambda ^{-1}x\not \in \mathscr {R}\). So when \(x\in \lambda \mathscr {R}\), there must be \(\lambda \ge F(x)\), and (i) holds.
(ii) is elementary. Since \(|\lambda |^{-1}F^{-1}(x)\lambda x\in \mathscr {R}\), there is
$$F(\lambda x)\le |\lambda |F(x).$$
Conversely, let \(\delta =F(\lambda x)\); then \(\delta ^{-1}\lambda x\in \mathscr {R}\). We may assume \(\lambda >0\), thus
$$F(x)\le \frac{\delta }{\lambda }\Longrightarrow \lambda F(x)\le F(\lambda x).$$
So \(F(\lambda x)=|\lambda |F(x)\), and (ii) holds.
To prove (iii), we let \(\mu _1=F(x), \mu _2=F(y),\Longrightarrow \mu _1^{-1}x\in \mathscr {R}, \mu _2^{-1}y\in \mathscr {R}\). By Lemma 7.4, we have
$$(\mu _1+\mu _2)^{-1}(x+y)=\frac{\mu _1}{\mu _1+\mu _2}(\mu _1^{-1}x) +\frac{\mu _2}{\mu _1+\mu _2}(\mu _2^{-1}y)\in \mathscr {R}.$$
Thus
$$F(x+y)\le \mu _1+\mu _2.$$
The Lemma holds.
Let \(\mathscr {R}\subset \mathbb {R}^n\) have volume \(V>0\); then there are n linearly independent vectors \(\{\alpha _1, \alpha _2, \ldots , \alpha _n\}\) in \(\mathscr {R}\) forming a basis of \(\mathbb {R}^n\). For any real numbers \(\mu _1, \mu _2, \ldots , \mu _n\), by Lemma 7.9, we have
$$\begin{aligned} \begin{aligned} F(\mu _1\alpha _1+\cdots +\mu _n\alpha _n)&\le |\mu _1|F(\alpha _1)+|\mu _2|F(\alpha _2)+\cdots +|\mu _n|F(\alpha _n)\\&\le |\mu _1|+|\mu _2|+\cdots +|\mu _n|. \end{aligned} \end{aligned}$$
Since \(\alpha _i\in \mathscr {R}\Rightarrow F(\alpha _i)\le 1\), the above estimate holds. This proves that \(F(x)<\infty \) for all \(x\in \mathbb {R}^n\).

Corollary 7.2

Let \(\mathscr {R}\subset \mathbb {R}^n\) meet the limiting conditions (7.11), and \(\mathrm{Vol}(\mathscr {R})>0\), then
  1. (i)

    \(\forall ~x\in \mathbb {R}^n\), there is \(\lambda \) such that \(x\in \lambda \mathscr {R}\);

     
  2. (ii)
    Let \(\{\alpha _1, \alpha _2, \ldots , \alpha _n\}\subset \mathscr {R}\) be a set of bases of \(\mathbb {R}^n\), then
    $$\left\{ \sum _{i=1}^n\mu _i\alpha _i | |\mu _1|+|\mu _2|+\cdots +|\mu _n|\le 1 \right\} \subset \mathscr {R}.$$
     

Proof

Since \(F(x)<\infty \), conclusion (i) follows directly from (i) of Lemma 7.9, and conclusion (ii) is given directly by Lemma 7.4.

Now let j be a subscript, and we define \(\lambda _j\) as
$$\begin{aligned} \lambda _j=\min \{ \lambda \ge 0| \lambda \mathscr {R}\ \ \text{ contains } \text{ j } \text{ linear } \text{ independent } \text{ integral } \text{ points } \text{ in } \mathbb {R}^n\}, \end{aligned}$$
(7.13)
and \(\lambda _j\) is called the jth successive minimum of \(\mathscr {R}\). By Lemma 7.3, \(\lambda \mathscr {R}\subset \lambda '\mathscr {R}\) whenever \(0\le \lambda \le \lambda '\). Therefore, as \(\lambda \) increases, \(\lambda \mathscr {R}\) eventually contains any desired set of vectors, so the existence of \(\lambda _j\) is proved.
By Lemma 7.6, letting V be the volume of \(\mathscr {R}\) and noting \(\mathrm{Vol}(\lambda \mathscr {R})=\lambda ^nV\), for the first successive minimum \(\lambda _1\) we have the following estimate
$$\begin{aligned} \lambda _1^nV\le 2^n. \end{aligned}$$
(7.14)
For \(\lambda _j\ \ (j\ge 2)\), there is no explicit individual upper bound, but we have the following conclusion.

Lemma 7.10

Let \(\mathscr {R}\subset \mathbb {R}^n\) be a convex body satisfying the limiting condition (7.11), \(V=\mathrm{Vol}(\mathscr {R})\), and let \(\lambda _1, \lambda _2, \ldots , \lambda _n\) be the n successive minima of \(\mathscr {R}\). Then we have
$$\begin{aligned} \frac{2^n}{n!}\le V\lambda _1\lambda _2\cdots \lambda _n\le 2^n. \end{aligned}$$
(7.15)

Proof

We only prove the left inequality of (7.15). Successively select linearly independent integral points \(x^{(1)}, x^{(2)}, \ldots , x^{(n)}\) such that \(x^{(j)}\in \lambda _{j}\mathscr {R}\) and \(x^{(j)}\) is linearly independent of \(x^{(1)}, x^{(2)}, \ldots , x^{(j-1)}\). Let \(x^{(j)}{=}(x_{j1}, x_{j2}, \ldots , x_{jn})\in \mathbb {Z}^n\). Since the matrix \(A=(x_{ji})_{n\times n}\) is an integer matrix with \(\det (A)\not =0\), we have
$$|\det (A)|\ge 1.$$
By Lemma 7.9, for any constants \(\mu _1, \mu _2, \ldots , \mu _n\), we have
$$\begin{aligned} \begin{aligned} F&(\mu _1x^{(1)}+\mu _2x^{(2)}+\cdots +\mu _nx^{(n)})\\&\le |\mu _1|F(x^{(1)})+|\mu _2|F(x^{(2)})+\cdots +|\mu _n|F(x^{(n)})\\&\le |\mu _1|\lambda _1+|\mu _2|\lambda _2+\cdots +|\mu _n|\lambda _n. \end{aligned} \end{aligned}$$
Thus, if \(|\mu _1|\lambda _1+|\mu _2|\lambda _2+\cdots +|\mu _n|\lambda _n\le 1\), then
$$\mu _1x^{(1)}+\mu _2x^{(2)}+\cdots +\mu _nx^{(n)}\in \mathscr {R}.$$
So set
$$\mathscr {R}_1=\{ \mu _1x^{(1)}+\mu _2x^{(2)}+\cdots +\mu _nx^{(n)}| |\mu _1|\lambda _1+|\mu _2|\lambda _2+\cdots +|\mu _n|\lambda _n\le 1\}\subset \mathscr {R}.$$
The volume of the set \(\mathscr {R}_1\) on the left is
$$\begin{aligned} \begin{aligned} \mathrm{Vol}(\mathscr {R}_1)&=\idotsint \limits _{|\mu _1|\lambda _1+|\mu _2|\lambda _2+\cdots +|\mu _n|\lambda _n\le 1}\mathrm{d}\mu _1 \mathrm{d}\mu _2\cdots \mathrm{d}\mu _n=\frac{2^n|\det (A)|}{n!\lambda _1\cdots \lambda _n}\\&\ge \frac{2^n}{n!\lambda _1\cdots \lambda _n}. \end{aligned} \end{aligned}$$
So there is
$$\frac{2^n}{n!\lambda _1\cdots \lambda _n}\le \mathrm{Vol}(\mathscr {R}_1)\le \mathrm{Vol}(\mathscr {R})=V.$$
Therefore, the left inequality of (7.15) holds. The proof of the right inequality is considerably more involved and is omitted here; interested readers may consult the classic works of J. W. S. Cassels (1963, 1971).
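For a concrete check of (7.15), the successive minima of a simple body can be computed by enumeration. The Python sketch below uses the unit disk in \(\mathbb {R}^2\) (an example of ours, for which \(F(x)=|x|\)), so \(\lambda _j\) is the j-th smallest norm among linearly independent integer vectors:

```python
import math
from itertools import product

def successive_minima_disk(search=3):
    # Successive minima of the unit disk R = {|x| <= 1} in R^2 with
    # respect to Z^2: lambda_j is the smallest dilation lambda such that
    # lambda*R contains j linearly independent integral points.
    pts = sorted((u for u in product(range(-search, search + 1), repeat=2)
                  if u != (0, 0)), key=lambda u: u[0]**2 + u[1]**2)
    u1 = pts[0]
    lam1 = math.hypot(*u1)
    # First integer point linearly independent of u1 (nonzero 2x2 determinant).
    u2 = next(u for u in pts if u1[0] * u[1] - u1[1] * u[0] != 0)
    lam2 = math.hypot(*u2)
    return lam1, lam2

lam1, lam2 = successive_minima_disk()
V = math.pi  # volume (area) of the unit disk
# Lemma 7.10: 2^n / n! <= V * lambda_1 * lambda_2 <= 2^n  with n = 2.
assert 2**2 / math.factorial(2) <= V * lam1 * lam2 <= 2**2
```

Here \(\lambda _1=\lambda _2=1\) (the standard basis vectors lie on the unit circle), so \(V\lambda _1\lambda _2=\pi \), which indeed lies between \(2^2/2!=2\) and \(2^2=4\).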

An important application of the above geometry of numbers is to solve the problem of rational approximation of real numbers, which is called Diophantine approximation in classical number theory. The main conclusion of this section is the following simultaneous rational approximation theorem of n real numbers.

Theorem 7.1

Let \(\theta _1, \theta _2, \ldots , \theta _n\) be any n real numbers, \(\theta _i\not =0\). Then for any number \(N>1\), there are a nonzero integer q and integers \(p_1, p_2, \ldots , p_n\) satisfying
$$\begin{aligned} {\left\{ \begin{aligned}&|q\theta _i-p_i|<N^{-\frac{1}{n}},\ \ 1\le i\le n;\\&|q|\le N. \end{aligned} \right. } \end{aligned}$$
(7.16)

Proof

The proof of the theorem is a simple application of Minkowski's theorem on linear forms (see Lemma 7.8). Let \(A\in \mathbb {R}^{(n+1)\times (n+1)}\) be the invertible square matrix of order \(n+1\) defined as
$$\begin{aligned} A=\begin{pmatrix} -1&{}0&{}\cdots &{} \cdots &{} 0 &{} \theta _1\\ 0&{} -1&{} \cdots &{} \cdots &{} 0 &{} \theta _2\\ \cdots &{}\cdots &{}\cdots &{}\cdots &{}\cdots &{}\cdots \\ 0&{}0&{}\cdots &{}0&{}-1&{}\theta _n\\ 0&{}\cdots &{}\cdots &{}0&{}0&{}-1 \end{pmatrix}. \end{aligned}$$
Obviously \(|\mathrm{det}(A)|=1\). Let c be the \((n+1)\)-dimensional positive vector \(c=(N^{-\frac{1}{n}}, N^{-\frac{1}{n}}, \ldots \), \(N^{-\frac{1}{n}}, N)\); because
$$c_1c_2\cdots c_nc_{n+1}=N^{-1}\cdot N=1\ge |\mathrm{det}(A)|.$$
So by Lemma 7.8, the symmetric convex body \(\mathscr {R}'(A, c)\) defined by A and c contains a nonzero integral point \(x=(p_1, p_2, \ldots , p_n, q)\not =0\). We prove \(q\not =0\). If \(q=0\), then since \(x\not =0\) there is some \(p_k\not =0\ \ (1\le k\le n)\), and the k-th inequality in Eq. (7.16) produces the following contradiction,
$$1\le |q\theta _k-p_k|<N^{-\frac{1}{n}}<1.$$
So \(q\not =0\), which completes the proof of Theorem 7.1.

Corollary 7.3

Let \(\theta _1, \ldots , \theta _n\) be any n real numbers. Then for any \(\varepsilon >0\), there are rational numbers \(\frac{p_i}{q}\ \ (1\le i\le n)\) satisfying
$$\begin{aligned} \left| \theta _i-\frac{p_i}{q}\right| <\frac{\varepsilon }{q}. \end{aligned}$$
(7.17)

Proof

Given any \(\varepsilon >0\), choose N with \(N^{-\frac{1}{n}}<\varepsilon \); then Formula (7.17) follows directly from Theorem 7.1.
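Theorem 7.1 also suggests a simple search procedure: try each denominator q up to N, taking \(p_i\) to be the integer nearest to \(q\theta _i\). A Python sketch follows; the choice \(\theta =(\sqrt{2}, \sqrt{3})\) and \(N=1000\) is a hypothetical example of ours, and the theorem guarantees the search succeeds:

```python
import math

def simultaneous_approx(thetas, N):
    # Search q = 1..N for a denominator as in Theorem 7.1:
    # |q*theta_i - p_i| < N^(-1/n) for all i, with p_i the integer
    # nearest to q*theta_i (the optimal choice for each q).
    n = len(thetas)
    bound = N ** (-1.0 / n)
    for q in range(1, N + 1):
        ps = [round(q * t) for t in thetas]
        if all(abs(q * t - p) < bound for t, p in zip(thetas, ps)):
            return q, ps
    return None  # Theorem 7.1 guarantees this is never reached

thetas = [math.sqrt(2.0), math.sqrt(3.0)]  # hypothetical example
N = 1000
q, ps = simultaneous_approx(thetas, N)
assert 1 <= q <= N
for t, p in zip(thetas, ps):
    # Corollary 7.3 form: |theta_i - p_i/q| < eps/q with eps = N^(-1/n)
    assert abs(t - p / q) < N ** (-1.0 / len(thetas)) / q
```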

7.2 Basic Properties of Lattice

A lattice is one of the most important concepts in modern cryptography: most of the so-called quantum-attack-resistant cryptosystems are lattice-based. What is a lattice? In short, a lattice is a geometric object in the n-dimensional Euclidean space \(\mathbb {R}^n\). For example, \(L=\mathbb {Z}^n\subset \mathbb {R}^n\) is a lattice in \(\mathbb {R}^n\), called the integer lattice or the trivial lattice. Applying an invertible linear transformation to \(\mathbb {Z}^n\) gives the concept of a general lattice in \(\mathbb {R}^n\); this is the geometric description of a lattice. Next, we give the precise algebraic definition of a lattice.

Definition 7.3

Let \(L\subset \mathbb {R}^n\) be a nonempty subset, which is called a lattice in \(\mathbb {R}^n\), if
  1. (i)

    L is an additive subgroup of \(\mathbb {R}^n\);

     
  2. (ii)
    There is a positive constant \(\lambda =\lambda (L)>0\), such that
    $$\begin{aligned} \min \{|x| | x\in L, x\not =0\}=\lambda , \end{aligned}$$
    (7.18)
     
\(\lambda =\lambda (L)\) is called the minimal distance of the lattice L.
By Definition 7.3, a lattice is simply a discrete additive subgroup of \(\mathbb {R}^n\), in which the minimal distance \(\lambda =\lambda (L)\) is the most important mathematical quantity of the lattice. Obviously, we have
$$\begin{aligned} \lambda =\min \{|x-y| | x\in L, y\in L, x\not =y\}, \end{aligned}$$
(7.19)
Equation (7.19) shows why \(\lambda \) is called the minimal distance of the lattice. If \(x\in L\) and \(|x|=\lambda \), then x is called a shortest vector of L.

In order to obtain a more explicit and concise mathematical expression of any lattice, we can regard an additive subgroup as a \(\mathbb {Z}\)-module. First, we prove that any lattice is a finitely generated \(\mathbb {Z}\)-module.

Lemma 7.11

Let \(L\subset \mathbb {R}^n\) be a lattice and \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\subset L\) be a set of vectors in L, then \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\) is linearly independent in \(\mathbb {R}\) if and only if \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\) is linearly independent in \(\mathbb {Z}\).

Proof

If \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\) is linearly independent over \(\mathbb {R}\), it is obviously linearly independent over \(\mathbb {Z}\). Conversely, suppose \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\) is linearly independent over \(\mathbb {Z}\); that is, any linear combination
$$a_1\alpha _1+\cdots +a_m\alpha _m=0,\ \ a_i\in \mathbb {Z},$$
forces \(a_1=a_2=\cdots =a_m=0\). Suppose some linear combination over \(\mathbb {R}\) equals 0, that is
$$\begin{aligned} \theta _1\alpha _1+\theta _2\alpha _2+\cdots +\theta _m\alpha _m=0, \ \ \theta _i\in \mathbb {R}. \end{aligned}$$
(7.20)
We prove \(\theta _1=\theta _2=\cdots =\theta _m=0\). By Theorem 7.1, for sufficiently large \(N>1\), there are integers \(q\not =0\) and \(p_1, p_2, \ldots , p_m\) such that
$$\begin{aligned} \left\{ \begin{aligned}&|q\theta _i-p_i|<N^{-\frac{1}{m}},\ \ 1\le i\le m;\\&q\le N. \end{aligned} \right. \end{aligned}$$
By (7.20), we have
$$\begin{aligned} \begin{aligned} |p_1\alpha _1+\cdots +p_m\alpha _m|&=|(q\theta _1-p_1)\alpha _1+\cdots +(q\theta _m-p_m)\alpha _m|\\&\le N^{-\frac{1}{m}}(|\alpha _1|+\cdots +|\alpha _m|)\\&\le mN^{-\frac{1}{m}}\max _{1\le i\le m} |\alpha _i|. \end{aligned} \end{aligned}$$
Let \(\lambda \) be the minimal distance of L and let \(\varepsilon >0\) be a sufficiently small positive number. We choose
$$N>\max \left\{ \varepsilon ^{-m}, \ m^m\max _{1\le i\le m}\frac{ |\alpha _i|^m}{\lambda ^m}\right\} ,$$
then \(N^{-\frac{1}{m}}<\varepsilon \), and
$$mN^{-\frac{1}{m}}\max _{1\le i\le m} |\alpha _i|<\lambda .$$
Thus
$$|p_1\alpha _1+\cdots +p_m\alpha _m|<\lambda .$$
Notice that \(p_1\alpha _1+\cdots +p_m\alpha _m\in L\), so \(p_1\alpha _1+\cdots +p_m\alpha _m=0\). Since \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\) is linearly independent over \(\mathbb {Z}\), we derive \(p_1=p_2=\cdots =p_m=0\). For any i, \(1\le i\le m\), we then get \(|\theta _i|\le |q\theta _i|<N^{-\frac{1}{m}}<\varepsilon \). Since \(\varepsilon \) is an arbitrarily small positive number, \(\theta _1=\theta _2=\cdots =\theta _m=0\). This proves that \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\) is also linearly independent over \(\mathbb {R}\). Lemma 7.11 holds.
From the above lemma, any lattice L in \(\mathbb {R}^n\) is a finitely generated \(\mathbb {Z}\)-module. Let \(\{\beta _1, \beta _2, \ldots , \beta _m\}\subset L\) be a set of \(\mathbb {Z}\)-bases of L; then the rank of L as a \(\mathbb {Z}\)-module satisfies
$$\begin{aligned} \mathrm{rank}(L)=m\le n, \end{aligned}$$
(7.21)
and
$$\begin{aligned} L=\left\{ \sum _{i=1}^ma_i\beta _i | a_i\in \mathbb {Z}\right\} . \end{aligned}$$
(7.22)
If \(\{\beta _1, \beta _2, \ldots , \beta _m\}\) is a \(\mathbb {Z}\)-basis of L and each \(\beta _i\) is regarded as a column vector, then the matrix
$$B=[\beta _1, \beta _2, \ldots , \beta _m]\in \mathbb {R}^{n\times m}, \ \ \mathrm{rank}(B)=m.$$
Equation (7.22) can be written as
$$\begin{aligned} L=L(B)=\{Bx| x\in \mathbb {Z}^m\}\subset \mathbb {R}^n. \end{aligned}$$
(7.23)
We call m the rank of the lattice L (as a \(\mathbb {Z}\)-module), \(B\in \mathbb {R}^{n\times m}\) the generating matrix of the lattice L, and \(\{\beta _1, \beta _2, \ldots , \beta _m\}\) a set of generating bases of L.
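With a generating matrix in hand, the minimal distance (7.18) of a small lattice \(L(B)=\{Bx\}\) can be found by enumerating coefficient vectors x in a box. A Python sketch follows; the 2-dimensional basis is a hypothetical example of ours, and the small search box is adequate only for such toy cases:

```python
import math
from itertools import product

def minimal_distance(B, search=5):
    # Brute-force the minimal distance lambda(L) of Eq. (7.18) for the
    # lattice L(B) = {Bx | x integral} of Eq. (7.23), enumerating
    # coefficient vectors x in the box [-search, search]^m.
    best = None
    m = len(B[0])
    for x in product(range(-search, search + 1), repeat=m):
        if any(x):
            v = [sum(bij * xj for bij, xj in zip(row, x)) for row in B]
            length = math.hypot(*v)
            best = length if best is None else min(best, length)
    return best

# Hypothetical basis: columns (2, 0) and (1, 2), written as rows of B below.
B = [(2.0, 1.0), (0.0, 2.0)]
lam = minimal_distance(B)
assert abs(lam - 2.0) < 1e-12   # the shortest vectors are +-(2, 0)
```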
If \(\{\alpha _1, \alpha _2, \ldots , \alpha _m \}\subset \mathbb {R}^n\) are any m column vectors in \(\mathbb {R}^n\) and \(A=[\alpha _1, \alpha _2, \ldots , \alpha _m]\in \mathbb {R}^{n\times m}\), the Gram matrix of \(\{\alpha _1, \alpha _2, \ldots , \alpha _m\}\) is defined as
$$T=(\langle \alpha _i, \alpha _j \rangle )_{m\times m}.$$
Obviously, we have \(T=A'A\), where \(A'\) is the transpose of A.

Lemma 7.12

Let \(A\in \mathbb {R}^{n\times m}\), \(b\in \mathbb {R}^n\) (\(m\le n\) is not required), then
  1. (i)
    Let \(x_0\in \mathbb {R}^m\) be a solution of \(A'Ax=A'b\), then
    $$|Ax_0-b|^2=\min _{x\in \mathbb {R}^m}|Ax-b|^2.$$
     
  2. (ii)

    \(\mathrm{rank}(A'A)=\mathrm{rank}(A)\), and homogeneous linear equations \(Ax=0\) and \(A'Ax=0\) have the same solution.

     
  3. (iii)
    \(A'Ax=A'b\) always has a solution \(x\in \mathbb {R}^m\), and when \(\mathrm{rank}(A)=m\), the solution is unique
    $$x=(A'A)^{-1}A'b.$$
     

Proof

First we prove (i). Let \(x_0\in \mathbb {R}^m\) satisfy \(A'Ax_0=A'b\); then for any \(x\in \mathbb {R}^m\), we have
$$Ax-b=(Ax_0-b)+A(x-x_0)=\gamma +\gamma _1\in \mathbb {R}^n.$$
We prove that \(\gamma \) and \(\gamma _1\) are two orthogonal vectors in \(\mathbb {R}^n\). Because
$$\begin{aligned} \begin{aligned} (A(x-x_0))'&(Ax_0-b)\\&=(x-x_0)'A'(Ax_0-b)\\&=(x-x_0)'(A'Ax_0-A'b)=0. \end{aligned} \end{aligned}$$
So \(\gamma \bot \gamma _1\), by Pythagorean theorem, we have
$$|Ax-b|^2=|Ax_0-b|^2+|A(x-x_0)|^2\ge |Ax_0-b|^2.$$
So (i) holds.
To prove (ii), let \(V_A\) be the solution space of \(Ax=0\) and \(V_{A'A}\) the solution space of \(A'Ax=0\); we prove \(V_A=V_{A'A}\). Clearly \(V_A\subset V_{A'A}\). Conversely, let \(x\in V_{A'A}\), that is, \(A'Ax=0\), then
$$x'A'Ax=0\Rightarrow (Ax)'Ax=\langle Ax, Ax \rangle =0.$$
The above formula holds if and only if \(Ax=0\), so \(x\in V_A\). Hence \(V_A=V_{A'A}\). Notice that
$$\begin{aligned} {\left\{ \begin{aligned}&\dim V_A=m-\mathrm{rank}(A)\\&\dim V_{A'A}=m-\mathrm{rank}(A'A). \end{aligned} \right. } \end{aligned}$$
So \(\mathrm{rank}(A)=\mathrm{rank}(A'A)\), and (ii) holds. To prove (iii), for given \(b\in \mathbb {R}^n\), the rank of the augmented matrix of the linear equation system \(A'Ax=A'b\) is
$$\begin{aligned} \begin{aligned} \mathrm{rank}[A'A, A'b]&=\mathrm{rank}(A'[A, b])\\&\le \mathrm{rank}(A')=\mathrm{rank}(A)=\mathrm{rank}(A'A). \end{aligned} \end{aligned}$$
Therefore, the augmented matrix and the coefficient matrix have the same rank, so the system is solvable. When \(\mathrm{rank}(A)=m\), then \(\mathrm{rank}(A'A)=m\), that is, \(A'A\) is an invertible square matrix of order m, thus
$$x=(A'A)^{-1}A'b$$
is the unique solution. Lemma 7.12 holds.

Lemma 7.13

\(A\in \mathbb {R}^{n\times m}\), and \(\mathrm{rank}(A)=m\), then \(A'A\) is a positive definite real symmetric matrix of order m, so there is a real orthogonal matrix \(P\in \mathbb {R}^{m\times m}\) of order m satisfying
$$\begin{aligned} P'A'AP=\mathrm{diag}\{\delta _1, \delta _2, \ldots , \delta _m \}, \end{aligned}$$
(7.24)
where \(\delta _1, \delta _2, \ldots , \delta _m>0\) are the m eigenvalues of \(A'A\).

Proof

\(\mathrm{rank}(A)=m\Rightarrow m\le n\). Let \(T=A'A\), then T is a symmetric matrix of order m. For \(x\in \mathbb {R}^m\), consider the quadratic form
$$x'Tx=x'A'Ax=(Ax)'(Ax)=\langle Ax, Ax \rangle \ge 0.$$
Because \(\mathrm{rank}(A)=m\), we have \(Ax=0\) only for \(x=0\), so \(x'Tx=0\) if and only if \(x=0\). Hence T is a positive definite matrix. From linear algebra, there is an orthogonal matrix P of order m such that \(P'TP\) is a diagonal matrix, that is
$$P'TP=\mathrm{diag}\{\delta _1, \delta _2, \ldots , \delta _m \}.$$
Because \(P'TP\) and T have the same eigenvalues, \(\delta _1, \delta _2, \ldots , \delta _m\) are the eigenvalues of T, and each \(\delta _i>0\). The Lemma holds.

Lemma 7.12 is called the least square method in linear algebra; its significance is to find a vector of shortest length in the set \(\{Ax-b| x\in \mathbb {R}^m\}\) for a given \(n\times m\) matrix A and a given vector \(b\in \mathbb {R}^n\). Lemma 7.12 gives an effective algorithm: solve the linear equations \(A'Ax=A'b\), and any solution \(x_0\) attains the minimum. Lemma 7.13 is called the diagonalization of quadratic forms. The main results are as follows:
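As a concrete illustration of the least square method of Lemma 7.12, the following Python sketch (the matrix A and the vector b are hypothetical, not from the text) forms the normal equations \(A'Ax=A'b\) and solves them by Gaussian elimination over exact fractions.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    # matrix product of M (p x q) and N (q x r)
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def solve(T, c):
    # Gaussian elimination for T x = c, with T square and invertible
    n = len(T)
    M = [row[:] + [c[i]] for i, row in enumerate(T)]
    for i in range(n):
        p = next(r for r in range(i, n) if M[r][i] != 0)  # pivot row
        M[i], M[p] = M[p], M[i]
        for r in range(n):
            if r != i and M[r][i] != 0:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Hypothetical overdetermined system: A is 3x2, b in R^3.
A = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(1)]]
b = [Fraction(1), Fraction(2), Fraction(2)]

At = transpose(A)
T = matmul(At, A)                                     # A'A
c = [sum(At[i][k] * b[k] for k in range(len(b)))      # A'b
     for i in range(len(At))]
x0 = solve(T, c)                                      # least squares solution of Ax ≈ b
print(x0)                                             # [Fraction(2, 3), Fraction(5, 3)]
```

Here \(\mathrm{rank}(A)=2=m\), so by part (iii) the solution is unique and equals \((A'A)^{-1}A'b\).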

Theorem 7.2

A subset \(L\subset \mathbb {R}^n\) is a lattice with \(\mathrm{rank}(L)=m\ (m\le n)\) if and only if there is a real matrix \(B\in \mathbb {R}^{n\times m}\) with \(\mathrm{rank}(B)=m\), such that
$$\begin{aligned} L=\{Bx| x\in \mathbb {Z}^m\}=\left\{ \sum _{i=1}^ma_i\beta _i | a_i\in \mathbb {Z}\right\} , \end{aligned}$$
(7.25)
where \(B=[\beta _1, \beta _2, \ldots , \beta _m]\), each \(\beta _i\in \mathbb {R}^n\) is a column vector.

Proof

Equation (7.23) proves the necessity of the condition, so we only prove sufficiency. If a subset L of \(\mathbb {R}^n\) is given by Eq. (7.25), then L is an additive subgroup of \(\mathbb {R}^n\): for any \(\alpha =Bx_1, \beta =Bx_2\) with \(x_1, x_2\in \mathbb {Z}^m\), we have \(x=x_1- x_2 \in \mathbb {Z}^m\), and
$$\alpha -\beta =B(x_1-x_2)=Bx\in L.$$
So we only need to prove the discreteness of L. Let \(T=B'B\); by Lemma 7.13, T is a positive definite real symmetric matrix. Let \(\delta _1, \delta _2, \ldots , \delta _m\) be the eigenvalues of T, and set
$$\delta =\min \{\delta _1, \delta _2, \ldots , \delta _m\}>0.$$
We prove
$$\begin{aligned} \min _{\begin{array}{c} x\in \mathbb {Z}^m\\ x\not =0 \end{array}}|Bx|\ge \sqrt{\delta }>0. \end{aligned}$$
(7.26)
By Lemma 7.13, there is an orthogonal matrix P of order m such that
$$P'TP=\mathrm{diag}\{\delta _1, \delta _2, \ldots , \delta _m\}.$$
For any given \(x\in \mathbb {Z}^m\), \(x\not =0\). We have
$$|Bx|^2=x'Tx=x'P(P'TP)P'x\ge \delta |P'x|^2=\delta |x|^2.$$
Because \(x\not =0\), then \(|x|^2\ge 1\), so
$$|Bx|^2\ge \delta , \ \ \forall ~x\in \mathbb {Z}^m, \ \ x\not =0.$$
This shows that the distance between any two different points of L is \(\ge \sqrt{\delta }>0\). Therefore, in a ball with center 0 and radius r, the number of points of L is finite. Among these finitely many vectors there is an \(\alpha \in L\) such that
$$|\alpha |=\min _{\begin{array}{c} x\in L\\ x\not =0 \end{array}}|x|=\lambda \ge \sqrt{\delta }>0.$$
According to the definition of lattice, L is a lattice in \(\mathbb {R}^n\). The Theorem holds.

The following corollary can be deduced directly from the above theorem.

Corollary 7.4

Let \(L=L(B)\subset \mathbb {R}^n\) be a lattice of \(\mathrm{rank}(L)=m\), \(\lambda \) be the minimum distance of L, \(B\in \mathbb {R}^{n\times m}\), \(\delta \) be the minimum eigenvalue of \(B'B\), then \(\lambda \ge \sqrt{\delta }\).

Definition 7.4

Let \(L\subset \mathbb {R}^n\) be a lattice with \(\mathrm{rank}(L)=n\); then L is called a full rank lattice of \(\mathbb {R}^n\).

By Theorem 7.2, a necessary and sufficient condition for L to be a full rank lattice of \(\mathbb {R}^n\) is the existence of an invertible square matrix \(B\in \mathbb {R}^{n\times n}\), \(\mathrm{det}(B)\not =0\), such that
$$\begin{aligned} L=L(B)=\left\{ \sum _{i=1}^na_i\beta _i|a_i\in \mathbb {Z}, 1\le i\le n \right\} =\{Bx|x\in \mathbb {Z}^n\}. \end{aligned}$$
(7.27)
If \(L=L(B)\) is a full rank lattice, define \(d=d(L)\) as
$$\begin{aligned} d=d(L)=|\mathrm{det}(B)|, \end{aligned}$$
(7.28)
d is called the determinant of L. \(d=d(L)\) is the second most important mathematical quantity of a lattice. The lattices we discuss below are always assumed to be full rank lattices.
For a lattice (full rank lattice), the generating matrix is not unique, but \(d=d(L)\) is unique. To prove this, we first define the so-called unimodular matrices. Define
$$\begin{aligned} SL_n(\mathbb {Z})=\{A=(a_{ij})_{n\times n}| a_{ij}\in \mathbb {Z}, \mathrm{det}(A)=\pm 1\}, \end{aligned}$$
(7.29)
Obviously, \(SL_n(\mathbb {Z})\) forms a group under matrix multiplication, because the identity matrix \(I_n\in SL_n(\mathbb {Z})\), and \(A_1\in SL_n(\mathbb {Z})\), \(A_2\in SL_n(\mathbb {Z})\) imply \(A=A_1A_2\in SL_n(\mathbb {Z})\). In particular, if \(A=(a_{ij})_{n\times n}\in SL_n(\mathbb {Z})\), then the inverse matrix of A is
$$\begin{aligned} A^{-1}=\pm \begin{pmatrix} a_{11}^*&a_{21}^*&\cdots &a_{n1}^*\\ a_{12}^*&a_{22}^*&\cdots &a_{n2}^*\\ \vdots &\vdots & &\vdots \\ a_{1n}^*&a_{2n}^*&\cdots &a_{nn}^* \end{pmatrix}\in SL_n(\mathbb {Z}), \end{aligned}$$
where \(a_{ij}^*\) is the algebraic cofactor of \(a_{ij}\).

Lemma 7.14

\(L=L(B)\subset \mathbb {R}^n\) is a lattice (full rank lattice), \(B_1\in \mathbb {R}^{n\times n}\); then \(L=L(B)=L(B_1)\) if and only if there is a unimodular matrix \(U\in SL_n(\mathbb {Z})\) such that \(B=B_1U\).

Proof

If \(B=B_1U\), \(U\in SL_n(\mathbb {Z})\), we prove \(L(B)=L(B_1)\). Let \(\alpha =B_1x\in L(B_1)\), where \(x\in \mathbb {Z}^n\), then
$$\alpha =B_1x=B_1UU^{-1}x=BU^{-1}x.$$
Because of \(U^{-1}x\in \mathbb {Z}^n\), then \(\alpha \in L(B)\), that is \(L(B_1)\subset L(B)\). Similarly, if \(\alpha =Bx\), \(x\in \mathbb {Z}^n\), then
$$\begin{aligned} \alpha =Bx=B_1Ux, \mathrm{where} ~Ux\in \mathbb {Z}^n. \end{aligned}$$
Thus \(\alpha \in L(B_1)\), that is \(L(B)\subset L(B_1)\). Hence \(L(B)=L(B_1)\).
Conversely, if \(L(B)=L(B_1)\), let \(B=[\beta _1, \beta _2, \ldots , \beta _n]\), \(B_1=[\alpha _1, \alpha _2, \ldots , \alpha _n]\), and let U be the transition matrix
$$(\beta _1, \beta _2, \ldots , \beta _n)=(\alpha _1, \alpha _2, \ldots , \alpha _n)U.$$
Since each \(\beta _i\in L(B_1)\ (1\le i\le n)\), U is an integer matrix. Similarly, write
$$(\alpha _1, \alpha _2, \ldots , \alpha _n)=(\beta _1, \beta _2, \ldots , \beta _n)U_1.$$
Since each \(\alpha _i\in L(B)\ (1\le i\le n)\), \(U_1\) is also an integer matrix. Because
$$(\beta _1, \beta _2, \ldots , \beta _n)=(\alpha _1, \alpha _2, \ldots , \alpha _n)U=(\beta _1, \beta _2, \ldots , \beta _n)U_1U,$$
we have \(U_1U=I_n\), thus \(\mathrm{det}(U)=\pm 1\), that is \(U\in SL_n(\mathbb {Z})\) and \(B=B_1U\). The Lemma holds.
By Lemma 7.14, if B and \(B_1\) are any two generating matrices of a lattice L, then
$$|\mathrm{det}(B)|=|\mathrm{det}(B_1)|=d=d(L).$$
That is, the determinant d(L) of a lattice is an invariant.
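This invariance is easy to check numerically. The following minimal Python sketch (a hypothetical 2-dimensional basis, not from the text) multiplies a generating matrix on the right by a unimodular matrix and compares the absolute determinants.

```python
def matmul2(M, N):
    # product of two 2x2 integer matrices
    return [[M[0][0]*N[0][0] + M[0][1]*N[1][0], M[0][0]*N[0][1] + M[0][1]*N[1][1]],
            [M[1][0]*N[0][0] + M[1][1]*N[1][0], M[1][0]*N[0][1] + M[1][1]*N[1][1]]]

def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

B = [[2, 1], [1, 3]]      # hypothetical generating matrix, columns are basis vectors
U = [[1, 1], [0, 1]]      # unimodular: det(U) = 1
B1 = matmul2(B, U)        # another generating matrix of the same lattice L(B)

print(abs(det2(B)), abs(det2(B1)))   # both equal d(L) = 5
```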
For a lattice (full rank lattice) \(L\subset \mathbb {R}^n\), the dual lattice of L is defined as
$$\begin{aligned} L^*=\{\alpha \in \mathbb {R}^n| \langle \alpha , \beta \rangle \in \mathbb {Z}, \forall ~\beta \in L\}. \end{aligned}$$
(7.30)

Lemma 7.15

Let \(L=L(B)\) be a lattice, then the dual lattice of L is \(L^*=L((B^{-1})')\), that is, if B is the generating matrix of L, then \((B^{-1})'\) is the generating matrix of \(L^*\).

Proof

Let
$$L((B^{-1})')=\{(B^{-1})'y| y\in \mathbb {Z}^n\}.$$
For any \(\alpha \in L((B^{-1})')\), write \(\alpha =(B^{-1})'y\) with \(y\in \mathbb {Z}^n\); for any \(\beta \in L\), write \(\beta =Bx\) with \(x\in \mathbb {Z}^n\), then
$$ \langle \alpha , \beta \rangle =\alpha '\beta =y'B^{-1}Bx=y'x\in \mathbb {Z}.$$
That means \(L((B^{-1})')\subset L^*\). Conversely, for any \(\alpha \in L^*\) and all \(\beta \in L\), we have \( \langle \alpha , \beta \rangle \in \mathbb {Z}\). Let \(B=[\beta _1, \beta _2, \ldots , \beta _n]\), then
$$\begin{aligned} \left\langle \alpha , \sum _{i=1}^nx_i\beta _i \right\rangle =\sum _{i=1}^nx_i \langle \alpha , \beta _i \rangle \in \mathbb {Z}, \forall ~x_i\in \mathbb {Z}, \end{aligned}$$
therefore, for each generating vector \(\beta _i (1\le i\le n)\), there is \( \langle \alpha , \beta _i \rangle \in \mathbb {Z}\). Write \(\alpha =(y_1, y_2, \ldots , y_n)\),
$$\begin{aligned} \langle \alpha , \beta _i \rangle \in \mathbb {Z}, \Longrightarrow B'\begin{pmatrix} y_1\\ \vdots \\ y_n \end{pmatrix}=\begin{pmatrix} x_1\\ \vdots \\ x_n \end{pmatrix}\in \mathbb {Z}^n. \end{aligned}$$
Thus
$$\begin{aligned} \begin{pmatrix} y_1\\ \vdots \\ y_n \end{pmatrix}=(B')^{-1}\begin{pmatrix} x_1\\ \vdots \\ x_n \end{pmatrix}. \end{aligned}$$
That is \(\alpha \in L((B')^{-1})\). Because \(B\cdot B^{-1}=I_n, \Longrightarrow (B^{-1})'B'=I_n\), thus \((B^{-1})'=(B')^{-1}\). So \(\alpha \in L((B^{-1})')\), that is \(L^*\subset L((B^{-1})')\). We have \(L^*=L((B^{-1})')\). The Lemma holds.

By Lemma 7.15, we immediately have the following corollary.

Corollary 7.5

Let \(L=L(B)\) be a full rank lattice, \(L^*\) is the dual lattice of L, then
$$d(L^*)=d^{-1}(L).$$
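Lemma 7.15 and Corollary 7.5 can be checked directly on a small hypothetical example (not from the text): the dual basis \((B^{-1})'\) pairs integrally with every lattice basis vector, and the determinants are reciprocal.

```python
from fractions import Fraction

B = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(3)]]              # hypothetical generating matrix (columns = basis)
d = B[0][0]*B[1][1] - B[0][1]*B[1][0]         # det(B) = 5

# inverse of a 2x2 matrix, then transpose: generating matrix of the dual lattice
Binv = [[ B[1][1]/d, -B[0][1]/d],
        [-B[1][0]/d,  B[0][0]/d]]
Bdual = [list(row) for row in zip(*Binv)]     # (B^{-1})'

# <dual basis vector i, primal basis vector j> must be an integer (here delta_ij)
for i in range(2):
    for j in range(2):
        ip = Bdual[0][i]*B[0][j] + Bdual[1][i]*B[1][j]
        print(i, j, ip)

d_dual = Bdual[0][0]*Bdual[1][1] - Bdual[0][1]*Bdual[1][0]
print(d_dual)                                  # 1/5, i.e. d(L*) = d(L)^{-1}
```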
An equivalence relation in \(\mathbb {R}^n\) can be defined by using a lattice L, for all \(\alpha , \beta \in \mathbb {R}^n\), we define
$$\alpha \equiv \beta ({{\,\mathrm{mod}\,}}L) \Longleftrightarrow \alpha -\beta \in L.$$
Obviously, this is an equivalence relation, called the congruence relation \({{\,\mathrm{mod}\,}}L\).

Definition 7.5

Let \(F\subset \mathbb {R}^n\) be a subset. F is called a basic region of a lattice (full rank lattice) L if
  1. (i)

\(\forall ~ x\in \mathbb {R}^n\), there is an \(\alpha \in F\) such that \(x\equiv \alpha ({{\,\mathrm{mod}\,}}L)\),

     
  2. (ii)

For any distinct \(\alpha _1, \alpha _2\in F\), we have \(\alpha _1\not \equiv \alpha _2 ({{\,\mathrm{mod}\,}}L)\).

     

By definition, a basic region of a lattice is a set of representatives of the additive quotient group \(\mathbb {R}^n/L\). Therefore, any basic region of L forms an additive group under \({{\,\mathrm{mod}\,}}L\).

Lemma 7.16

Let \(L=L(B)\) be a full rank lattice, then
  1. (i)

Any two basic regions \(F_1\) and \(F_2\) of L are isomorphic additive groups \(({{\,\mathrm{mod}\,}}L)\).

     
  2. (ii)

\(F=\{Bx|x=(x_1, x_2, \ldots , x_n)', \text{ and }~0\le x_i<1, 1\le i\le n\}\) is a basic region of L(B).

     
  3. (iii)

    \(\mathrm{Vol}(F)=d=d(L)\).

     

Proof

(i) is trivial, because
$$F_1\cong \mathbb {R}^n/L, F_2\cong \mathbb {R}^n/L, \Longrightarrow F_1\cong F_2.$$
To prove (ii), let \(B=[\beta _1, \beta _2, \ldots , \beta _n]\); then \(\{\beta _1, \beta _2, \ldots , \beta _n\}\) is a basis of \(\mathbb {R}^n\), so any \(\alpha \in \mathbb {R}^n\) can be expressed as a linear combination
$$\alpha =\sum _{i=1}^na_i\beta _i, \ \ a_i\in \mathbb {R}.$$
Let \([\alpha ]_B=\sum _{i=1}^n [a_i]\beta _i\), \(\{\alpha \}_B=\alpha -[\alpha ]_B\), then \(\{\alpha \}_B\) can be expressed as
$$\begin{aligned} \{\alpha \}_B=B\begin{pmatrix} x_1\\ x_2\\ \vdots \\ x_n \end{pmatrix}, \text{ where }~0\le x_i<1, \ \ 1\le i\le n. \end{aligned}$$
That is \(\{\alpha \}_B\in F\). Because \(\alpha -\{\alpha \}_B=[\alpha ]_B\in L\), so for any \(\alpha \in \mathbb {R}^n\), there is a \(\{\alpha \}_B\in F\), such that
$$\alpha \equiv \{\alpha \}_B ({{\,\mathrm{mod}\,}}L).$$
If \(\alpha =Bx\) and \(\beta =By\) are two different points in F, then
$$\alpha -\beta =B(x-y)=Bz,$$
where \(z=(z_1, z_2, \ldots , z_n)\), \(|z_i|<1\), \(z\ne 0\), so \(\alpha \not \equiv \beta ({{\,\mathrm{mod}\,}}L)\). Therefore F is a basic region of L.
Let’s prove (iii). Because all basic regions of L are isomorphic, they have the same volume; (ii) gives a specific basic region F of L, so it suffices to prove \(\mathrm{Vol}(F)=d=d(L)\) for this F. Obviously,
$$\mathrm{Vol}(F)=\idotsint \limits _{y=(y_1, y_2, \ldots , y_n)\in F} \mathrm{d}y_1\mathrm{d}y_2\cdots \mathrm{d}y_n$$
Making the variable substitution \(y=Bx\), the Jacobian of the transformation is \(|\mathrm{det}(B)|=d(L)\), so
$$\mathrm{d}y_1\mathrm{d}y_2\cdots \mathrm{d}y_n=d(L)\,\mathrm{d}x_1\cdots \mathrm{d}x_n.$$
Thus
$$\mathrm{Vol}(F)=\int \limits _0^1\cdots \int \limits _0^1 d(L)\,\mathrm{d}x_1\cdots \mathrm{d}x_n=d(L).$$
We have completed the proof of Lemma 7.16.
Next, we discuss the Gram–Schmidt orthogonalization algorithm. If \(B=[\beta _1, \beta _2, \ldots , \beta _n]\) is the generating matrix of L, \(\{\beta _1, \beta _2, \ldots , \beta _n\}\) can be transformed into a set of orthogonal bases \(\{\beta ^*_1, \beta ^*_2, \ldots , \beta ^*_n\}\), where \(\beta _1^*=\beta _1\), and
$$\begin{aligned} \beta _i^*=\beta _i-\sum _{j=1}^{i-1} \frac{\langle \beta _i, \beta _j^* \rangle }{\langle \beta _j^*, \beta _j^* \rangle }\beta _j^*, \end{aligned}$$
(7.31)
\(\{\beta ^*_1, \beta ^*_2, \ldots , \beta ^*_n\}\) is called the orthogonal basis corresponding to \(\{\beta _1, \beta _2, \ldots , \beta _n\}\). \(B^*=[\beta _1^*, \ldots , \beta _n^*]\) is the orthogonal matrix corresponding to B. For any \(1\le i\le n\), denote
$$\begin{aligned} {\left\{ \begin{aligned}&u_{ii}=1, \ u_{ij}=0, \text{ when }~ j>i,\\&u_{ij}=\frac{\langle \beta _i, \beta _j^* \rangle }{|\beta _j^*|^2}, \text{ when }~ 1\le j\le i\le n,\\&U=(u_{ij})_{n\times n}. \end{aligned} \right. } \end{aligned}$$
(7.32)
Then U is a lower triangular matrix, and
$$\begin{aligned} \begin{pmatrix} \beta _1\\ \beta _2\\ \vdots \\ \beta _n \end{pmatrix} =U\begin{pmatrix} \beta _1^*\\ \beta _2^*\\ \vdots \\ \beta _n^* \end{pmatrix}. \end{aligned}$$
(7.33)
If both sides are transposed at the same time, there is
$$\begin{aligned} (\beta _1, \beta _2, \ldots , \beta _n)=(\beta ^*_1, \beta ^*_2, \ldots , \beta ^*_n)U'. \end{aligned}$$
(7.34)
Therefore, \(U'\) is the transition matrix between two groups of bases.

Lemma 7.17

Let \(L=L(B)\subset \mathbb {R}^n\) be a lattice, \(B=[\beta _1, \beta _2, \ldots , \beta _n]\) is the generating matrix, \(B^*=[\beta ^*_1, \beta ^*_2, \ldots , \beta ^*_n]\) is the corresponding orthogonal matrix, \(d=d(L)\) is the determinant of L, then we have
$$\begin{aligned} d=\prod _{i=1}^n |\beta ^*_i|\le \prod _{i=1}^n |\beta _i|. \end{aligned}$$
(7.35)

Proof

By (7.34), we have \(B=B^*U'\); because \(\det (U')=\det (U)=1\), so
$$\det (B)=\det (B^*).$$
By the definition,
$$\begin{aligned} \begin{aligned} d^2&=\det (B'B)=\det (U(B^*)'B^*U')\\&=\det ((B^*)'B^*)\\&=\det (\mathrm{diag}\{|\beta _1^*|^2, |\beta _2^*|^2, \ldots , |\beta _n^*|^2\}). \end{aligned} \end{aligned}$$
So there is
$$d=\prod _{i=1}^n |\beta _i^*|.$$
To prove the inequality on the right of Eq. (7.35), it suffices to prove
$$\begin{aligned} |\beta _i^*|\le |\beta _i|, \ \ 1\le i\le n. \end{aligned}$$
(7.36)
Because \(\beta _i=\sum _{j=1}^i u_{ij}\beta _j^*\), then
$$\begin{aligned} \begin{aligned} |\beta _i|^2&=\langle \beta _i, \beta _i \rangle = \left\langle \sum _{j=1}^i u_{ij}\beta _j^*, \sum _{j=1}^iu_{ij}\beta _j^* \right\rangle \\&=\sum _{j=1}^iu_{ij}^2 \langle \beta _j^*, \beta _j^* \rangle \\&=\langle \beta _i^*, \beta _i^* \rangle +\sum _{j=1}^{i-1} u_{ij}^2 \langle \beta _j^*, \beta _j^* \rangle . \end{aligned} \end{aligned}$$
Therefore, the inequality on the right of (7.35) holds, the Lemma is proved.

Equation (7.35) is usually called the Hadamard inequality; here we have given another proof of it.
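The Gram–Schmidt computation (7.31), together with a numerical check of \(d=\prod _i|\beta _i^*|\le \prod _i|\beta _i|\), can be sketched as follows on a hypothetical 2-dimensional basis (not from the text):

```python
from math import sqrt

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def gram_schmidt(basis):
    # basis: list of vectors beta_i; returns the orthogonalized beta_i^* of (7.31)
    ortho = []
    for b in basis:
        bs = list(b)
        for o in ortho:
            mu = dot(b, o) / dot(o, o)
            bs = [x - mu*y for x, y in zip(bs, o)]
        ortho.append(bs)
    return ortho

basis = [[3.0, 1.0], [2.0, 2.0]]         # hypothetical basis vectors, |det B| = 4
ortho = gram_schmidt(basis)

prod_star, prod_len = 1.0, 1.0
for b, o in zip(basis, ortho):
    prod_star *= sqrt(dot(o, o))          # |beta_i^*|
    prod_len *= sqrt(dot(b, b))           # |beta_i|

print(prod_star)                          # ~4.0, equals d(L) = |det B|
print(prod_star <= prod_len + 1e-12)      # Hadamard inequality (7.35)
```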

In order to define the successive minima of a lattice L, we denote the minimum distance of L by \(\lambda _1\), that is, \(\lambda _1=\lambda (L)\). Equivalently, \(\lambda _1\) is the minimum positive real number r such that the linear space spanned by \(L\cap \mathrm{Ball}(0, r)\) is one-dimensional, where
$$\mathrm{Ball}(0, r)=\{x\in \mathbb {R}^n| |x|\le r \}$$
is the closed ball with center 0 and radius r. The n successive minima \(\lambda _1, \lambda _2, \ldots , \lambda _n\) of L can then be defined.

Definition 7.6

Let \(L=L(B)\subset \mathbb {R}^n\) be a full rank lattice; the i-th successive minimum \(\lambda _i\) is defined as
$$\lambda _i=\lambda _i(L)=\inf \{r| \dim (\mathrm{span}(L\cap \mathrm{Ball}(0, r)))\ge i\}.$$

The following lemma is a useful lower bound estimate of the minimum distance \(\lambda _1\).

Lemma 7.18

\(L=L(B)\subset \mathbb {R}^n\) is a lattice (full rank lattice), \(B^*=[\beta ^*_1, \beta ^*_2, \ldots , \beta ^*_n]\) is the corresponding orthogonal basis, then
$$\begin{aligned} \lambda _1=\lambda (L)\ge \min _{1\le i\le n} |\beta _i^*|. \end{aligned}$$
(7.37)

Proof

For \(\forall ~ x\in \mathbb {Z}^n\), \(x\not =0\), we prove
$$|Bx|\ge \min _{1\le i\le n}|\beta _i^*|, \ x\in \mathbb {Z}^n, \ x\not =0.$$
Let \(x=(x_1, x_2, \ldots , x_n)\not =0\) and let j be the largest subscript with \(x_j\not =0\), then
$$|\langle Bx, \beta _j^* \rangle |= \left| \left\langle \sum _{i=1}^jx_i\beta _i, \beta _j^* \right\rangle \right| =|x_j||\beta _j^*|^2.$$
Because when \(i<j\),
$$\langle \beta _i, \beta _j^* \rangle =0, \; \text{ and } \; \langle \beta _j, \beta _j^* \rangle =\langle \beta _j^*, \beta _j^* \rangle .$$
On the other hand,
$$|\langle Bx, \beta _j^* \rangle |\le |Bx||\beta _j^*|.$$
So
$$|Bx|\ge |x_j||\beta _j^*|\ge \min _{1\le i\le n}|\beta _i^*|.$$
Lemma 7.18 holds!

Corollary 7.6

The successive minima \(\lambda _1, \lambda _2, \ldots , \lambda _n\) of a lattice L are attained, that is, there exist \(\alpha _i\in L\) such that \(|\alpha _i|=\lambda _i\), \(1\le i\le n\).

Proof

The lattice points contained in the ball \(\mathrm{Ball}(0, \delta )\) with center 0 and radius \(\delta \ (\delta >\lambda _i)\) are finite in number: if a bounded region (of finite volume) contained infinitely many lattice points, there would be a convergent subsequence, but the distance between any two different points of L is at least \(\lambda _1\). Hence
$$|L\cap \mathrm{Ball}(0, \delta )|<\infty , \ \ \delta >\lambda _i.$$
Since each such ball contains only finitely many lattice points, it is not hard to find \(\alpha _1\in L\) with \(|\alpha _1|=\lambda _1\), \(\alpha _2\in L\) with \(|\alpha _2|=\lambda _2\), \(\ldots \), \(\alpha _n\in L\) with \(|\alpha _n|=\lambda _n\). The Corollary holds.

In Sect. 7.1, the geometry of numbers is relative to the integer lattice \(\mathbb {Z}^n\); next, we extend the main results to the general full rank lattice.

Lemma 7.19

(Compare with Lemma 7.5) \(L=L(B)\subset \mathbb {R}^n\) is a lattice (full rank lattice), \(\mathscr {R}\subset \mathbb {R}^n\); if \(\mathrm{Vol}(\mathscr {R})>d(L)\), then there are two different points \(\alpha \in \mathscr {R}\), \(\beta \in \mathscr {R}\) such that \(\alpha -\beta \in L\).

Proof

Let F be a basic region of L, that is
$$F=\{Bx| x=(x_1, \ldots , x_n), 0\le x_i<1, 1\le i\le n\}.$$
Obviously, \(\mathbb {R}^n\) can be divided into the following disjoint subsets,
$$\begin{aligned} \begin{aligned} \mathbb {R}^n&=\cup _{\alpha \in L} \{\alpha +y| y\in F\}\\&=\cup _{\alpha \in L} \{\alpha + F\}. \end{aligned} \end{aligned}$$
For a given lattice point \(\alpha \in L\), define
$$\mathscr {R}_{\alpha }=\mathscr {R}\cap \{\alpha +F\}=\alpha +D_{\alpha }, D_{\alpha }\subset F.$$
Therefore, \(\mathscr {R}\) can be divided into the following disjoint subsets,
$$\mathscr {R}=\cup _{\alpha \in L}\mathscr {R}_{\alpha },\Rightarrow \mathrm{Vol}(\mathscr {R})=\sum _{\alpha \in L} \mathrm{Vol}(\mathscr {R}_{\alpha })=\sum _{\alpha \in L} \mathrm{Vol}(D_{\alpha }).$$
If for any \(\alpha , \beta \in L\), \(\alpha \not =\beta \), \(D_{\alpha }\cap D_{\beta }=\varnothing \), then
$$\mathrm{Vol}(\mathscr {R})=\mathrm{Vol}(\cup _{\alpha \in L}D_{\alpha })\le \mathrm{Vol}(F)=d(L),$$
which contradicts the assumption. So there must exist \(\alpha , \beta \in L\), \(\alpha \not =\beta \), such that \(D_{\alpha }\cap D_{\beta }\ne \varnothing \). Let \(x\in D_{\alpha }\cap D_{\beta }\), then \(\alpha +x\in \mathscr {R}, \beta +x\in \mathscr {R}\). And
$$(\alpha +x)-(\beta +x)=\alpha -\beta \in L.$$
The Lemma holds.

Lemma 7.20

(Compare with Lemma 7.6) Let L be a full rank lattice and \(\mathscr {R}\subset \mathbb {R}^n\) a symmetric convex body. If \(\mathrm{Vol}(\mathscr {R})>2^nd(L)\), then \(\mathscr {R}\) contains a nonzero lattice point, that is, \(\exists ~\alpha \in L\), \(\alpha \not =0,\) such that \(\alpha \in \mathscr {R}\).

Proof

Let
$$\frac{1}{2}\mathscr {R}=\{x|2x\in \mathscr {R}\}.$$
Then
$$\mathrm{Vol}\left( \frac{1}{2}\mathscr {R}\right) =2^{-n}\mathrm{Vol}(\mathscr {R})>d(L).$$
By Lemma 7.19, there are two different points \(x\in \frac{1}{2}\mathscr {R}, y\in \frac{1}{2}\mathscr {R}\) such that \(x-y\in L\), \(x-y\ne 0\). Because \(2x\in \mathscr {R}\), \(2y\in \mathscr {R}\), and \(\mathscr {R}\) is a symmetric convex body, by Lemma 7.4,
$$\frac{1}{2}(2x-2y)=x-y\in \mathscr {R}.$$
The Lemma holds.

Corollary 7.7

Let L be a full rank lattice, \(\lambda (L)=\lambda _1\) is the minimum distance of L. Then
$$\begin{aligned} \lambda _1=\lambda (L)\le \sqrt{n}(d(L))^{\frac{1}{n}}. \end{aligned}$$
(7.38)

Proof

First we prove
$$\begin{aligned} \mathrm{Vol}(\mathrm{Ball}(0, r))\ge \left( \frac{2r}{\sqrt{n}} \right) ^n. \end{aligned}$$
(7.39)
This is because \(\mathrm{Ball}(0, r)\) contains the following cube:
$$\left\{ x\in \mathbb {R}^n |x=(x_1, \ldots , x_n), \forall ~ |x_i|<\frac{r}{\sqrt{n}} \right\} \subset \mathrm{Ball}(0, r).$$
By the definition, there are no nonzero lattice points in the open ball \(\mathrm{Ball}(0, \lambda _1)\); since \(\mathrm{Ball}(0, \lambda _1)\) is a symmetric convex body, by Lemma 7.20 there is
$$\mathrm{Vol}(\mathrm{Ball}(0, \lambda _1))\le 2^nd(L).$$
Thus
$$\left( \frac{2\lambda _1}{\sqrt{n}}\right) ^n\le 2^{n}d(L).$$
That is
$$\lambda _1\le \sqrt{n}(d(L))^{\frac{1}{n}}.$$
The Corollary holds.
Combined with Eq. (7.37), we obtain the estimation of the upper and lower bounds of the minimum distance of a lattice,
$$\begin{aligned} \min _{1\le i\le n}|\beta _i^*|\le \lambda (L)\le \sqrt{n}(d(L))^{\frac{1}{n}}. \end{aligned}$$
(7.40)
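For a small hypothetical lattice (not from the text), both sides of (7.40) can be verified by brute force: enumerate short combinations \(Bx\), find \(\lambda (L)\), and compare it with \(\min _i|\beta _i^*|\) and \(\sqrt{n}\,d(L)^{1/n}\).

```python
from math import sqrt

b1, b2 = (2.0, 1.0), (1.0, 3.0)            # hypothetical basis columns, d(L) = 5
d = abs(b1[0]*b2[1] - b1[1]*b2[0])

# brute-force the minimum distance over a small box of coefficients
lam = min(sqrt((a*b1[0] + c*b2[0])**2 + (a*b1[1] + c*b2[1])**2)
          for a in range(-5, 6) for c in range(-5, 6) if (a, c) != (0, 0))

# Gram-Schmidt: beta_1^* = b1, beta_2^* = b2 - mu * b1
mu = (b1[0]*b2[0] + b1[1]*b2[1]) / (b1[0]**2 + b1[1]**2)
s2 = (b2[0] - mu*b1[0], b2[1] - mu*b1[1])
lower = min(sqrt(b1[0]**2 + b1[1]**2), sqrt(s2[0]**2 + s2[1]**2))
upper = sqrt(2) * d**0.5                   # sqrt(n) * d(L)^{1/n} with n = 2

print(lower <= lam <= upper)               # True: inequality (7.40)
```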

Lemma 7.21

Let \(L\subset \mathbb {R}^n\) be a lattice (full rank lattice), \(\lambda _1, \lambda _2, \ldots , \lambda _n\) the successive minima of L, and \(d=d(L)\) the determinant of L, then
$$\begin{aligned} \lambda _1\lambda _2 \ldots \lambda _n\le n^{\frac{n}{2}}d(L). \end{aligned}$$
(7.41)

Proof

Let \(\{\alpha _1, \alpha _2, \ldots , \alpha _n\}\subset L\) with \(|\alpha _i|=\lambda _i\) be a set of vectors attaining the successive minima; they form a basis of \(\mathbb {R}^n\). Let
$$\begin{aligned} T=\left\{ y\in \mathbb {R}^n| \sum _{i=1}^n\left( \frac{ \langle y, \alpha _i^* \rangle }{\lambda _i|\alpha _i^*|}\right) ^2<1\right\} , \end{aligned}$$
(7.42)
where \(\{\alpha _1^*, \alpha _2^*, \ldots , \alpha _n^*\}\) is the orthogonal basis corresponding to \(\{\alpha _1, \alpha _2, \ldots , \alpha _n\}\). Let’s prove that T does not contain any nonzero lattice point. Let \(y\in L\), \(y\not =0\), and let k be the largest subscript such that \(|y|\ge \lambda _k\), then
$$y\in \mathrm{Span}(\alpha _1^*, \alpha _2^*, \ldots , \alpha _k^*)=\mathrm{Span}(\alpha _1, \alpha _2, \ldots , \alpha _k).$$
Indeed, if y were linearly independent of \(\alpha _1, \alpha _2, \ldots , \alpha _k\), then
$$k+1\le \dim (\mathrm{Span}(\alpha _1, \alpha _2, \ldots , \alpha _k, y)\cap \mathrm{Ball}(0, |y|)).$$
so \(\lambda _{k+1}\le |y|\) would follow from the definition of \(\lambda _{k+1}\), contradicting the choice of k. By \(y\in \mathrm{Span}(\alpha _1, \alpha _2, \ldots , \alpha _k)\),
$$\begin{aligned} \begin{aligned} \sum _{i=1}^n \left( \frac{\langle y, \alpha _i^* \rangle }{\lambda _i |\alpha _i^*|} \right) ^2&=\sum _{i=1}^k \left( \frac{ \langle y, \alpha _i^* \rangle }{\lambda _i|\alpha _i^*| } \right) ^2\\ \ge \frac{1}{\lambda _k^2}\sum _{i=1}^k\frac{ \langle y, \alpha _i^* \rangle ^2}{|\alpha _i^*|^2}&=\frac{1}{\lambda _k^2}|y|^2\ge 1. \end{aligned} \end{aligned}$$
Therefore \(y\not \in T\). Since T is a symmetric convex body containing no nonzero lattice point, by Lemma 7.20,
$$\mathrm{Vol}(T)\le 2^nd.$$
On the other hand,
$$\begin{aligned} \begin{aligned} \mathrm{Vol}(T)&=\left( \prod _{i=1}^n\lambda _i \right) \cdot \mathrm{Vol}(\mathrm{Ball}(0, 1))\\&\ge \prod _{i=1}^n\lambda _i \left( \frac{2}{\sqrt{n}}\right) ^n. \end{aligned} \end{aligned}$$
So
$$\prod _{i=1}^n\lambda _i\le n^{\frac{n}{2}}d.$$
Lemma 7.21 holds.

The above lemma shows that the upper bound (7.38) for \(\lambda _1\) remains valid for the \(\lambda _i\) in the sense of the geometric mean.

Finally, we discuss computationally hard problems on lattices. These problems are the main scientific basis and technical support in the design of trapdoor functions, and they are also the cornerstone of the security of lattice-based cryptography.

  1. Shortest vector problem (SVP)
A lattice L is a discrete set in \(\mathbb {R}^n\), and its minimum distance \(\lambda _1=\lambda (L)\) is the length of a shortest vector in L. The problem of finding, for an arbitrary full rank lattice L, a shortest vector \(u_0\in L \) with
$$|u_0|=\min _{x\in L, x\not =0}|x|=\lambda _1.$$
is the so-called shortest vector problem. At present it presents insurmountable difficulties in theory and computation: we only know that \(u_0\) exists, but we cannot compute it efficiently. Therefore, current research focuses mainly on approximating the shortest vector, that is, finding a nonzero vector \(u\in L\) such that
$$|u|\le r(n)\lambda _1, \ u\in L, u\not =0,$$
where \(r(n)\ge 1\) is called the approximation coefficient and depends only on the dimension of the lattice L.

In 1982, H. W. Lenstra, A. K. Lenstra and L. Lovász creatively developed a set of algorithms (1982) that effectively solve the approximation problem of the shortest vector: the famous LLL algorithm of lattice theory. The computational complexity of the LLL algorithm is polynomial in the lattice dimension, with approximation coefficient \(r(n)=2^{\frac{n-1}{2}}\). How to improve the approximation coefficient of the LLL algorithm to a polynomial in n is a main current research topic. For example, Schnorr's work in 1987 and Gama and Nguyen's work (2008a, 2008b) are very representative, but the coefficients obtained are still far from polynomial, so the academic community generally conjectures:
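The LLL algorithm itself is not reproduced in this chapter; the following is a minimal Python sketch of its standard textbook form (reduction parameter \(\delta =3/4\), basis vectors stored as rows, Gram–Schmidt data recomputed at every step for clarity rather than speed), run on a small hypothetical basis:

```python
def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def gram_schmidt(B):
    # returns orthogonalized vectors B* and mu[i][j] = <b_i, b_j*>/<b_j*, b_j*>
    Bs, mu = [], [[0.0]*len(B) for _ in B]
    for i, b in enumerate(B):
        bs = list(b)
        for j in range(i):
            mu[i][j] = dot(b, Bs[j]) / dot(Bs[j], Bs[j])
            bs = [x - mu[i][j]*y for x, y in zip(bs, Bs[j])]
        Bs.append(bs)
    return Bs, mu

def lll(B, delta=0.75):
    # textbook LLL sketch: size-reduce, test the Lovász condition, swap or advance
    B = [list(map(float, b)) for b in B]
    k, n = 1, len(B)
    while k < n:
        Bs, mu = gram_schmidt(B)
        for j in range(k - 1, -1, -1):          # size-reduce b_k against b_j
            q = round(mu[k][j])
            if q:
                B[k] = [x - q*y for x, y in zip(B[k], B[j])]
                Bs, mu = gram_schmidt(B)
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k-1]**2) * dot(Bs[k-1], Bs[k-1]):
            k += 1                              # Lovász condition satisfied
        else:
            B[k-1], B[k] = B[k], B[k-1]         # swap and step back
            k = max(k - 1, 1)
    return B

# hypothetical input basis (rows are basis vectors)
reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print(reduced)   # rows of a reduced basis with much shorter vectors
```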

Conjecture 1: there is no polynomial-time algorithm approximating the shortest vector with an approximation coefficient r(n) that is a polynomial function of n.

  2. Closest vector problem (CVP)
Let \(L\subset \mathbb {R}^n\) be a lattice and \(t\in \mathbb {R}^n\) an arbitrary given vector; it is easy to prove that there is a lattice point \(u_t\in L\) such that
$$|u_t-t|=\min _{x\in L}|x-t|,$$
\(u_t\) is called the closest lattice point (vector) to t. When \(t=0\) is the zero vector, \(u_0\) is the shortest vector of L, so the closest vector problem is a general form of the shortest vector problem. Similarly, we only know the existence of the closest vector \(u_t\); there is no efficient algorithm to find it, so one studies the approximation problem instead: find \(x\in L\) such that
$$|x-t|\le r_1(n)|u_t-t|,$$
then x is called an approximate closest vector with approximation coefficient \(r_1(n)\). In 1986, Babai proposed an effective algorithm to approximate the closest vector in Babai (1986), and its approximation coefficient \(r_1(n)\) is generally of the same order as the approximation coefficient r(n) of the shortest vector.
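Babai's algorithm is not reproduced here; as a simple illustration of approximating the closest vector, the following Python sketch implements the cruder round-off variant (write \(t=By\), round y to the nearest integer vector x, and return Bx) on a hypothetical basis and target:

```python
from fractions import Fraction

# hypothetical lattice basis as columns of B
B = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(3)]]
d = B[0][0]*B[1][1] - B[0][1]*B[1][0]
Binv = [[ B[1][1]/d, -B[0][1]/d],
        [-B[1][0]/d,  B[0][0]/d]]

def babai_round(t):
    # solve By = t, round the coordinates, map back into the lattice
    y = [Binv[0][0]*t[0] + Binv[0][1]*t[1],
         Binv[1][0]*t[0] + Binv[1][1]*t[1]]
    x = [round(c) for c in y]
    return [B[0][0]*x[0] + B[0][1]*x[1],
            B[1][0]*x[0] + B[1][1]*x[1]]

t = (Fraction(23, 10), Fraction(39, 10))   # target t = (2.3, 3.9)
u = babai_round(t)
print(u)                                   # a lattice point close to t
```

For this input, \(y=(3/5,\,11/10)\) rounds to \(x=(1,1)\), giving the lattice point \(u=(3,4)\).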

There are many other hard computational problems on lattices, such as the successive shortest vector problem, which is essentially to find a deterministic algorithm approximating each \(\alpha _i\in L\) with \(|\alpha _i|=\lambda _i\), the successive minima of L. However, SVP and CVP are the problems most commonly used in lattice cryptosystem design and analysis, and most research is based on integer lattices.

7.3 Integer Lattice and q-Ary Lattice

Definition 7.7

A full rank lattice L is called an integer lattice if \(L\subset \mathbb {Z}^{n}\); an integer lattice L is called a q-ary lattice if \(q \mathbb {Z}^{n} \subset L\subset \mathbb {Z}^{n}\), where \(q\ge 1\) is a positive integer.

It is easy to see from the definition that a lattice \(L=L(B)\) is an integer lattice \(\Leftrightarrow B \in \mathbb {Z}^{n\times n}\) is an integer square matrix, so the determinant \(d=d(L)\) of an integer lattice L is a positive integer.

Lemma 7.22

Let \(L=L(B) \subset \mathbb {Z}^{n}\) be an integer lattice and \(d=d(L)\) the determinant of L, then \(d \mathbb {Z}^{n} \subset L\subset \mathbb {Z}^{n}\); therefore, an integer lattice is always a d-ary lattice \((q=d)\).

Proof

Let \(\alpha \in d\mathbb {Z}^{n}\); let's prove that \(\alpha \in L\), that is, \(\alpha =Bx\) always has an integer solution \(x \in \mathbb {Z}^{n}\). Let \(B^{-1}\) be the inverse matrix of B, then
$$\begin{aligned} B^{-1}=\frac{1}{\det (B)}B^{*} =\frac{1}{\det (B)} \left[ \begin{array}{cccc} b_{11}^{*} & b_{21}^{*} &\cdots & b_{n1}^{*} \\ b_{12}^{*} & b_{22}^{*} &\cdots & b_{n2}^{*} \\ \vdots & \vdots & &\vdots \\ b_{1n}^{*} & b_{2n}^{*} &\cdots & b_{nn}^{*}\\ \end{array}\right] , \end{aligned}$$
where \(B=(b_{ij})_{n\times n}\) and \(b_{ij}^{*}\) is the algebraic cofactor of \(b_{ij}\). Because \(B \in \mathbb {Z}^{n\times n}\), we have \(B^{*} \in \mathbb {Z}^{n\times n}\), thus \(dB^{-1}= \pm B^{*} \in \mathbb {Z}^{n\times n}\). Write \(\alpha =d\beta \) with \(\beta \in \mathbb {Z}^{n}\), then
$$\begin{aligned} x=B^{-1}\alpha =dB^{-1}\beta =\pm B^{*}\beta \in \mathbb {Z}^{n}. \end{aligned}$$
Thus \(\alpha \in L\). That is \(d \mathbb {Z}^{n} \subset L\), the Lemma holds.
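The adjugate computation in this proof can be sketched numerically (a minimal 2-dimensional illustration with a hypothetical integer basis): since \(dB^{-1}=\pm B^{*}\) is an integer matrix, \(d\beta \) is always an integer combination of the basis vectors.

```python
# hypothetical integer generating matrix (columns = basis vectors)
B = [[2, 1],
     [1, 3]]
d = B[0][0]*B[1][1] - B[0][1]*B[1][0]      # d(L) = det(B) = 5

# adjugate of a 2x2 matrix: B * adj = det(B) * I
adj = [[ B[1][1], -B[0][1]],
       [-B[1][0],  B[0][0]]]

beta = [4, -7]                             # arbitrary integer vector
alpha = [d*beta[0], d*beta[1]]             # alpha in d * Z^2

# integer solution of B x = alpha:  x = adj * beta
x = [adj[0][0]*beta[0] + adj[0][1]*beta[1],
     adj[1][0]*beta[0] + adj[1][1]*beta[1]]

# check B x == alpha with x integral, hence d * Z^2 is contained in L(B)
Bx = [B[0][0]*x[0] + B[0][1]*x[1],
      B[1][0]*x[0] + B[1][1]*x[1]]
print(Bx == alpha)                          # True
```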

The following lemma is a simple conclusion in algebra; for completeness, we give a proof.

Lemma 7.23

Let L be a q-ary lattice, \(\mathbb {Z}_{q}\) is the residual class rings \({{\,\mathrm{mod}\,}}q\), then
  1. (i)

    \(\mathbb {Z}^{n}/q\mathbb {Z}^{n} \cong \mathbb {Z}_{q}^{n}\) (additive group isomorphism).

     
  2. (ii)

\(\mathbb {Z}^{n}/L \cong \mathbb {Z}_{q}^{n} \big / \left( L/q\mathbb {Z}^{n}\right) \) (additive group isomorphism). Therefore, \(L/q\mathbb {Z}^{n}\) is a linear code in \(\mathbb {Z}_{q}^{n}\).

     

Proof

Let \(\alpha =(a_{1},a_{2},\ldots ,a_{n})\in \mathbb {Z}^{n}\) and \(\beta =(b_{1},b_{2},\ldots ,b_{n})\in \mathbb {Z}^{n}\); if \(a_{i}\equiv b_{i}({{\,\mathrm{mod}\,}}q)\) for all i, we write \(\alpha \equiv \beta ({{\,\mathrm{mod}\,}}q)\). For any \(\alpha \in \mathbb {Z}^{n}\), define
$$\begin{aligned} \bar{\alpha }=(\bar{a_{1}},\bar{a_{2}},\ldots ,\bar{a_{n}})\in \mathbb {Z}_{q}^{n}, \end{aligned}$$
where \(\bar{a_{i}}\) is the minimum nonnegative residue of \(a_{i} {{\,\mathrm{mod}\,}}q\); thus \(\alpha \equiv \bar{\alpha }({{\,\mathrm{mod}\,}}q)\). Define the mapping \(\sigma : \mathbb {Z}^{n} \longrightarrow \mathbb {Z}_{q}^{n}\) by \(\sigma (\alpha )= \bar{\alpha }\); this is a surjection, and
$$\begin{aligned} \sigma (\alpha +\beta )= \bar{\alpha }+\bar{\beta }=\sigma (\alpha )+\sigma (\beta ). \end{aligned}$$
Therefore, \(\sigma \) is a surjective group homomorphism. Obviously \(\mathrm{Ker}\, \sigma =q\mathbb {Z}^{n}\); therefore, by the isomorphism theorem of groups, we have
$$\begin{aligned} \mathbb {Z}^{n}/q\mathbb {Z}^{n} \cong \mathbb {Z}_{q}^{n}. \end{aligned}$$
Because of \(q\mathbb {Z}^{n} \subset L \subset \mathbb {Z}^{n}\), then by the isomorphism theorem of groups,
$$\begin{aligned} \mathbb {Z}^{n}/L \cong (\mathbb {Z}^{n}/q\mathbb {Z}^{n}) /(L/q\mathbb {Z}^{n})\cong \mathbb {Z}_{q}^{n} /(L/q\mathbb {Z}^{n}). \end{aligned}$$
The Lemma holds.
Next, we will prove that \(\mathbb {Z}^{n}/L \) is a finite group. To this end, we first discuss elementary transformations of matrices. An elementary transformation of a matrix is an elementary row or column transformation, specifically one of the following three kinds:
  1. (1)
Exchange two rows or two columns of matrix A:
$$\begin{aligned} {\left\{ \begin{aligned}&\sigma _{ij}(A)\text {-Exchange rows }i~\text {and }j~\text {of }A \\&\tau _{ij}(A)\text {-Exchange columns }i~\text {and}~j~\text {of }A\\ \end{aligned} \right. } \end{aligned}$$
     
  2. (2)
Multiply a row or column of A by \(-1\):
    $$\begin{aligned} {\left\{ \begin{aligned}&\sigma _{-i}(A)\text {-Multiply row }i~\text {of }A~\text {by }- 1 \\&\tau _{-i}(A)\text {-Multiply column }i~\text {of }A~\text {by }- 1 \\ \end{aligned} \right. } \end{aligned}$$
     
  3. (3)
Add k times a row (column) to another row (column), where \(k\in \mathbb {R}\); in many cases, we require \(k \in \mathbb {Z}\) to be an integer:
    $$\begin{aligned} {\left\{ \begin{aligned}&\sigma _{ki+j}(A)\text {-Add }k~\text {times of row }i~\text {of }A~\text {to row }j \\&\tau _{ki+j}(A)\text {-Add } k~\text {times of column }i~\text {of }A~\text {to column }j \\ \end{aligned} \right. } \end{aligned}$$
     
The identity matrix of order n is denoted by \(I_{n}\); a matrix obtained from \(I_{n}\) by one of the above elementary transformations is called an elementary matrix. We note that all elementary matrices are unimodular matrices (see (7.29)), and
$$\begin{aligned} {\left\{ \begin{aligned}&\sigma _{ij}(A)=\sigma _{ij}(I_{n})A, ~\tau _{ij}(A)=A\tau _{ij}(I_{n})\\&\sigma _{-i}(A)=\sigma _{-i}(I_{n})A, ~\tau _{-i}(A)=A\tau _{-i}(I_{n})\\&\sigma _{ki+j}(A)=\sigma _{ki+j}(I_{n})A, ~\tau _{ki+j}(A)=A\tau _{ki+j}(I_{n})\\ \end{aligned} \right. } \end{aligned}$$
(7.43)
That is, an elementary row transformation of A is equivalent to multiplying by the corresponding elementary matrix from the left, and an elementary column transformation of A is equivalent to multiplying by the corresponding elementary matrix from the right.
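The identities (7.43) can be checked directly; a minimal sketch for one of the six cases (the matrix entries are arbitrary):

```python
# Check (7.43) for one case: adding k times row i to row j equals
# left-multiplication by the corresponding elementary matrix sigma_{ki+j}(I_n).
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[a][c]*Y[c][b] for c in range(n)) for b in range(n)] for a in range(n)]

def sigma_ki_j(A, k, i, j):           # add k times row i of A to row j
    R = [row[:] for row in A]
    R[j] = [R[j][c] + k*R[i][c] for c in range(len(A))]
    return R

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
I = [[1 if a == b else 0 for b in range(3)] for a in range(3)]
E = sigma_ki_j(I, -2, 0, 2)           # the elementary matrix obtained from I_3
assert sigma_ki_j(A, -2, 0, 2) == mat_mul(E, A)
```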

Lemma 7.24

Let \(L=L(B)\subset \mathbb {Z}^{n}\) be an integer lattice, then \(\mathbb {Z}^{n}/L\) is a finite group, and
$$\begin{aligned} |\mathbb {Z}^{n}/L|=d(L). \end{aligned}$$

Proof

According to linear algebra, an integer square matrix \(B\in \mathbb {Z}^{n\times n}\) can always be transformed into a lower triangular matrix by elementary row transformations; that is, there is a unimodular matrix \(U \in SL_{n}(\mathbb {Z})\) such that
$$\begin{aligned} UB = \left[ \begin{array}{ccccc} * &{} 0 &{}\cdots &{} 0 \\ * &{} * &{}\cdots &{} 0 \\ \cdots &{}\cdots &{}\cdots &{}\cdots \\ * &{}* &{}\cdots &{}* \\ \end{array}\right] . \end{aligned}$$
By elementary column transformations, the lower triangular matrix UB can then be transformed into a diagonal matrix; that is, there is a unimodular matrix \(U_{1} \in SL_{n}(\mathbb {Z})\) such that
$$\begin{aligned} UBU_{1}=\mathrm{diag}\{\delta _{1}, \delta _{2},\ldots , \delta _{n}\}, \end{aligned}$$
where \(\delta _{i} \ne 0\), \(\delta _{i} \in \mathbb {Z}\), and
$$\begin{aligned} d(L)=|\det (UBU_{1})|=\prod _{i=1}^{n} |\delta _{i}|. \end{aligned}$$
Let \(L(UBU_{1})\) be the integer lattice generated by \(UBU_{1}\); we have the quotient group isomorphism
$$\begin{aligned} \mathbb {Z}^{n}/L(UBU_{1}) \cong \oplus _{i=1}^{n}\mathbb {Z}/{|\delta _{i}|\mathbb {Z}}= \oplus _{i=1}^{n}\mathbb {Z}_{|\delta _{i}|}. \end{aligned}$$
Thus
$$\begin{aligned} |\mathbb {Z}^{n}/L(UBU_{1})| =\prod ^{n}_{i=1}|\delta _{i}| =d(L). \end{aligned}$$
Since \(L(B)=L(BU_{1})\) and \( L(B)\cong L(UB)\), we have \(L(B) \cong L(UBU_{1})\), so
$$\begin{aligned} |\mathbb {Z}^{n}/L(B)| =|\mathbb {Z}^{n}/L(UBU_{1})| =d(L). \end{aligned}$$
Lemma 7.24 holds.
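Lemma 7.24 can be verified by brute force on a small example: since \(d\mathbb {Z}^{n} \subset L\), the cube \(\{0,\ldots ,d-1\}^{n}\) contains representatives of all cosets of \(\mathbb {Z}^{n}/L\). A sketch (the \(2\times 2\) basis is an arbitrary illustration):

```python
from fractions import Fraction
from itertools import product

B = [[2, 1], [1, 3]]                     # columns generate L, det = 5
d = B[0][0]*B[1][1] - B[0][1]*B[1][0]
Binv = [[Fraction( B[1][1], d), Fraction(-B[0][1], d)],
        [Fraction(-B[1][0], d), Fraction( B[0][0], d)]]

def in_L(v):                             # v in L  iff  B^{-1} v is integral
    x = [Binv[i][0]*v[0] + Binv[i][1]*v[1] for i in range(2)]
    return all(xi.denominator == 1 for xi in x)

# Since d*Z^2 is contained in L, {0,...,d-1}^2 covers all cosets of Z^2/L.
reps = []
for p in product(range(abs(d)), repeat=2):
    if not any(in_L((p[0]-r[0], p[1]-r[1])) for r in reps):
        reps.append(p)
assert len(reps) == abs(d)               # |Z^2 / L| = d(L) = 5
```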
An integer square matrix \(B=(b_{ij})_{n\times n} \in \mathbb {Z}^{n\times n}\) is called a Hermite normal form matrix if B is an upper triangular matrix, that is, \(b_{ij}=0\) for \(1\le j<i\le n\), and
$$\begin{aligned} b_{ii}\ge 1,0\le b_{ij}<b_{ii},1\le i<j\le n. \end{aligned}$$
(7.44)
A Hermite normal form matrix is referred to as an HNF matrix.

Definition 7.8

Let \(L=L(B)\subset \mathbb {Z}^{n}\) be an integer lattice. If B is an HNF matrix, then B is called the HNF basis of L, denoted by \(B=\mathrm{HNF}(L)\).

The following lemma proves that an integer lattice has a unique HNF basis, so it is reasonable to use \(\mathrm{HNF}(L)\) to represent the HNF basis.

Lemma 7.25

Let \(L \subset \mathbb {Z}^{n}\) be an integer lattice, then there is a unique HNF matrix B such that \(L=L(B)\).

Proof

Let \(L=L(A)\), where A is a generating matrix of L. By elementary column transformations, A can be transformed into an upper triangular matrix; that is,
$$\begin{aligned} AU_{1}= \left[ \begin{array}{ccccc} c_{11} &{} c_{12} &{}\cdots &{} c_{1n} \\ 0 &{}c_{22} &{}\cdots &{} c_{2n} \\ \cdots &{}\cdots &{}\cdots &{}\cdots \\ 0 &{}0 &{}\cdots &{} c_{nn} \\ \end{array}\right] , \ ~U_{1}\in SL_{n}(\mathbb {Z}). \end{aligned}$$
where \(c_{ii}>0\), \(1\le i\le n\). Transforming \(AU_{1}\) further, there is a unimodular matrix \(U_{2}\) such that \(AU_{1}U_{2}=B\) is an HNF matrix; since \(L(B)=L(AU_{1}U_{2})\), we know that L has the HNF basis B.

Let us prove the uniqueness of the HNF basis B. If there are two HNF matrices \(B_{1}\), \(B_{2}\) with \(L(B_{1})=L(B_{2})\), then by Lemma 7.14 there is a unimodular matrix \(U \in SL_{n}(\mathbb {Z})\) such that \(B_{1}=B_{2}U\); that is, \(B_{1}\) can be obtained from \(B_{2}\) by continuously applying the elementary column transformations defined by formula (7.43). But applying any of the column transformations \(\tau _{ij}\), \(\tau _{-i}\), \(\tau _{ki+j}\) to \(B_{2}\) destroys the HNF property, so \(U=I_{n}\) is the identity matrix, that is, \(B_{1}=B_{2}\). The Lemma holds.
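An HNF basis in the sense of (7.44) can be computed by integer column operations only, mirroring the existence part of the proof. A minimal sketch (an illustrative, unoptimized routine; it assumes the input matrix is nonsingular):

```python
def hnf(A):
    # Column-operation Hermite normal form in the convention of (7.44):
    # upper triangular, b_ii >= 1, 0 <= b_ij < b_ii for j > i.
    # A: nonsingular square integer matrix whose columns generate L.
    n = len(A)
    B = [row[:] for row in A]

    def addcol(dst, src, q):             # column_dst += q * column_src
        for r in range(n):
            B[r][dst] += q * B[r][src]

    def swapcol(j, k):
        for r in range(n):
            B[r][j], B[r][k] = B[r][k], B[r][j]

    for i in range(n - 1, -1, -1):
        for j in range(i):               # Euclid on columns j and i in row i
            while B[i][j] != 0:
                q = B[i][i] // B[i][j]
                addcol(i, j, -q)
                swapcol(i, j)
        if B[i][i] < 0:                  # make the diagonal entry positive
            for r in range(n):
                B[r][i] = -B[r][i]
        for j in range(i + 1, n):        # reduce entries right of the diagonal
            q = B[i][j] // B[i][i]
            addcol(j, i, -q)
    return B

assert hnf([[2, 1], [1, 3]]) == [[5, 2], [0, 1]]   # det preserved: 5
```

All operations used are the unimodular column transformations of (7.43), so the returned matrix generates the same lattice.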

Lemma 7.26

Let \(L=L(B)\) be an integer lattice, where \(B=(b_{ij})_{n\times n}\) is an HNF matrix and \(B^{*} =[\beta _{1}^{*}, \beta _{2}^{*}, \ldots , \beta _{n}^{*}]\) is the orthogonal basis corresponding to \(B=[\beta _{1}, \beta _{2}, \ldots , \beta _{n}]\), then
$$\begin{aligned} B^{*}=[\beta _{1}^{*} , \beta _{2}^{*} ,\ldots , \beta _{n}^{*}]=\mathrm{diag}\{b_{11},b_{22},\ldots , b_{nn}\} \end{aligned}$$
is a diagonal matrix.

Proof

We prove \(\beta _{i}^{*}=(0, 0, \ldots , b_{ii}, 0, \ldots , 0)^{'}\) by induction on i. When \(i=1\), \(\beta _{1}^{*}=\beta _{1}=(b_{11}, 0,\ldots , 0)^{'}\) and the proposition holds. If \(\beta _{j}^{*}=(0,0,\ldots , b_{jj}, 0, \ldots , 0)^{'}\) holds for all \(j\le i\), then for \(i+1\), by (7.31), there is
$$\begin{aligned} \begin{aligned} \beta _{i+1}^{*}&=\beta _{i+1}-\sum _{j=1}^{i}\frac{\langle \beta _{i+1},\beta _{j}^{*}\rangle }{|\beta _{j}^{*}|^{2}}\beta _{j}^{*}\\&=\beta _{i+1}-\sum _{j=1}^{i}\frac{b_{j(i+1)}}{b_{jj}}\beta _{j}^{*}\\&=\begin{pmatrix} b_{1(i+1)}\\ b_{2(i+1)}\\ \vdots \\ b_{i(i+1)}\\ b_{(i+1)(i+1)}\\ \vdots \\ 0 \end{pmatrix} - \begin{pmatrix} b_{1(i+1)}\\ b_{2(i+1)}\\ \vdots \\ b_{i(i+1)}\\ 0\\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ \vdots \\ 0\\ b_{(i+1)(i+1)}\\ \vdots \\ 0 \end{pmatrix}. \end{aligned} \end{aligned}$$
Thus the proposition holds.
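Lemma 7.26 can be verified numerically: applying Gram–Schmidt to the columns of an HNF matrix yields a diagonal system. A sketch with an arbitrary HNF example:

```python
from fractions import Fraction

def gram_schmidt(cols):
    # cols: list of basis vectors. Returns the Gram-Schmidt vectors beta_i^*.
    stars = []
    for b in cols:
        v = [Fraction(x) for x in b]
        for s in stars:
            mu = sum(a*c for a, c in zip(v, s)) / sum(c*c for c in s)
            v = [a - mu*c for a, c in zip(v, s)]
        stars.append(v)
    return stars

# An HNF basis as in (7.44): columns of an upper triangular matrix.
B = [[5, 2, 1],
     [0, 3, 2],
     [0, 0, 4]]
cols = [[B[r][j] for r in range(3)] for j in range(3)]
stars = gram_schmidt(cols)
# Lemma 7.26: beta_i^* = (0,...,0, b_ii, 0,...,0).
assert stars[0] == [5, 0, 0]
assert stars[1] == [0, 3, 0]
assert stars[2] == [0, 0, 4]
```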

Next, we discuss q-ary lattices, where \(q\ge 1\) is a positive integer, the following two q-ary lattices are often used in lattice cryptosystems.

Definition 7.9

Let \(\mathbb {Z}_{q}\) be a residue class ring \({{\,\mathrm{mod}\,}}q\), \(A \in \mathbb {Z}_{q}^{n\times m}\), the following two q-ary lattices are defined as
$$\begin{aligned} \Lambda _{q}(A)=\{y \in \mathbb {Z}^{m}\,|\, \text {there is }x \in \mathbb {Z}^{n} \text { such that } y\equiv A^{'}x ({{\,\mathrm{mod}\,}}q)\}, \end{aligned}$$
(7.45)
and
$$\begin{aligned} \Lambda _{q}^{\bot }(A)=\{y \in \mathbb {Z}^{m}|Ay\equiv 0 ({{\,\mathrm{mod}\,}}q)\}. \end{aligned}$$
(7.46)
By definition, \( \Lambda _{q}(A)\subset \mathbb {Z}^{m}\) and \( \Lambda _{q}^{\bot }(A)\subset \mathbb {Z}^{m}\) are m-dimensional integer lattices. For any \(\alpha \in q\mathbb {Z}^{m}\), taking \(x=0 \in \mathbb {Z}^{n}\) gives \(\alpha \equiv A^{'}x({{\,\mathrm{mod}\,}}q)\) and \(A\alpha \equiv 0({{\,\mathrm{mod}\,}}q)\); hence
$$\begin{aligned} {\left\{ \begin{aligned}&q\mathbb {Z}^{m}\subset \Lambda _{q}(A) \subset \mathbb {Z}^{m}\\&q\mathbb {Z}^{m}\subset \Lambda _{q}^{\perp } (A)\subset \mathbb {Z}^{m}.\\ \end{aligned} \right. } \end{aligned}$$
That is, \(\Lambda _{q}(A)\) and \(\Lambda _{q}^{\perp }(A)\) are q-ary lattices of dimension m.
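Membership in the two lattices of Definition 7.9 can be tested directly from (7.45) and (7.46); a small brute-force sketch (the matrix A and modulus q are arbitrary illustrations):

```python
from itertools import product

q, n, m = 5, 1, 2
A = [[1, 2]]                      # a 1 x 2 matrix over Z_5, chosen arbitrarily

def in_lambda_perp(y):            # test (7.46): A y = 0 (mod q)
    return all(sum(A[i][j] * y[j] for j in range(m)) % q == 0 for i in range(n))

def in_lambda(y):                 # test (7.45): y = A' x (mod q) for some x in Z^n
    return any(all((sum(A[i][j] * x[i] for i in range(n)) - y[j]) % q == 0
                   for j in range(m))
               for x in product(range(q), repeat=n))

# q*Z^m is contained in both lattices, as noted below Definition 7.9:
assert in_lambda_perp([3 * q, -q]) and in_lambda([3 * q, -q])
# (2, -1) satisfies 1*2 + 2*(-1) = 0, so it lies in Lambda_q^perp(A):
assert in_lambda_perp([2, -1])
```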

Lemma 7.27

We have
$$\begin{aligned} {\left\{ \begin{aligned}&\Lambda _{q}^{\perp }(A)=q\Lambda _{q}(A) ^{*}\\&\Lambda _{q}(A)=q \Lambda _{q}^{\perp }(A)^{*}\\ \end{aligned} \right. } \end{aligned}$$

Proof

For any \(\alpha \in \Lambda _{q}(A) ^{*}\), by definition,
$$\begin{aligned} \langle y, \alpha \rangle \in \mathbb {Z}, \forall ~y\in \Lambda _{q}(A). \end{aligned}$$
And
$$\begin{aligned} \langle y, \alpha \rangle =y^{'}\alpha \in \mathbb {Z}\Rightarrow y^{'}\alpha \equiv 0({{\,\mathrm{mod}\,}}1). \end{aligned}$$
There is
$$\begin{aligned} y^{'}q\alpha \equiv 0({{\,\mathrm{mod}\,}}q), \forall ~ y\in \Lambda _{q}(A). \end{aligned}$$
Because \(y\in \Lambda _{q}(A)\), there is \(x \in \mathbb {Z}^{n}\) such that \(y\equiv A^{'}x({{\,\mathrm{mod}\,}}q)\); from the above formula,
$$\begin{aligned} x^{'}Aq\alpha \equiv 0({{\,\mathrm{mod}\,}}q), \forall ~ x\in \mathbb {Z}^{n}. \end{aligned}$$
Thus
$$\begin{aligned} Aq\alpha \equiv 0({{\,\mathrm{mod}\,}}q),\Rightarrow q\alpha \in \Lambda _{q}^{\perp }(A). \end{aligned}$$
We prove
$$\begin{aligned} q\Lambda _{q}(A) ^{*}\subset \Lambda _{q}^{\perp }(A). \end{aligned}$$
Conversely, if \(y\in \Lambda _{q}^{\perp }(A)\), we have
$$\begin{aligned} Ay \equiv 0({{\,\mathrm{mod}\,}}q)\Rightarrow A \left( \frac{1}{q}y\right) \equiv 0({{\,\mathrm{mod}\,}}1). \end{aligned}$$
For any \(\alpha \in \Lambda _{q}(A)\), choose \(x \in \mathbb {Z}^{n}\) with \(\alpha \equiv A^{'}x ({{\,\mathrm{mod}\,}}q)\), then
$$\begin{aligned} \left\langle \alpha ,\frac{1}{q}y \right\rangle =x^{'}A \left( \frac{1}{q}y\right) \equiv 0({{\,\mathrm{mod}\,}}1), \forall ~ x\in \mathbb {Z}^{n}. \end{aligned}$$
We have
$$\begin{aligned} \frac{1}{q}y\in \Lambda _{q}(A) ^{*}\Rightarrow y \in q\Lambda _{q}(A)^{*}. \end{aligned}$$
That is
$$\begin{aligned} \Lambda _{q}^{\perp }(A)\subset q\Lambda _{q}(A)^{*}. \end{aligned}$$
Thus, \(\Lambda _{q}^{\perp }(A)= q\Lambda _{q}(A)^{*}\). Similarly, the second equation can be proved.

Lemma 7.28

Let q be a prime, \(A\in \mathbb {Z}_{q}^{n\times m}, m \ge n\), and \(\mathrm{rank}(A)=n\), then
$$\begin{aligned} |\det (\Lambda _{q}^{\perp }(A))|=q^{n}, \end{aligned}$$
(7.47)
and
$$\begin{aligned} |\det (\Lambda _{q}(A))|=q^{m-n}. \end{aligned}$$
(7.48)

Proof

Over the finite field \(\mathbb {Z}_{q}\), since \(\mathrm{rank}(A)=n\), the linear system \(Ay=0\) has exactly \(q^{m-n}\) solutions in \(\mathbb {Z}_{q}^{m}\), from which we get
$$\begin{aligned} |\Lambda _{q}^{\perp }(A)/q\mathbb {Z}^{m}|=q^{m-n}. \end{aligned}$$
By Lemma 7.23,
$$| \mathbb {Z}^{m}/\Lambda _{q}^{\perp }(A) |=|\mathbb {Z}_{q}^{m}/(\Lambda _{q}^{\perp }(A) /q\mathbb {Z}^{m})|=q^{m}/q^{m-n}=q^{n}.$$
By Lemma 7.24,
$$\begin{aligned} |\det (\Lambda _{q}^{\perp }(A))|=| \mathbb {Z}^{m}/\Lambda _{q}^{\perp }(A) |=q^{n}. \end{aligned}$$
So (7.47) holds. By Corollary 7.5 of the previous section, we have
$$\begin{aligned} |\det (\Lambda _{q}^{\perp }(A)^{*})|=q^{-n}. \end{aligned}$$
By Lemma 7.27,
$$\begin{aligned} |\det (\Lambda _{q}(A))|=q^{m}|\det (\Lambda _{q}^{\perp }(A)^{*})|=q^{m-n}. \end{aligned}$$
The Lemma holds.
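The counting argument of the proof can be reproduced by brute force for small parameters; the sketch below counts \(|\Lambda _{q}^{\perp }(A)/q\mathbb {Z}^{m}|\) and \(|\Lambda _{q}(A)/q\mathbb {Z}^{m}|\) (A and q are arbitrary illustrations):

```python
from itertools import product

q, n, m = 3, 1, 2
A = [[1, 2]]                                   # rank(A) = n = 1 over Z_q

# Count solutions of A y = 0 (mod q): this is |Lambda_q^perp(A)/qZ^m| = q^{m-n}.
perp = [y for y in product(range(q), repeat=m)
        if all(sum(A[i][j]*y[j] for j in range(m)) % q == 0 for i in range(n))]
assert len(perp) == q**(m - n)

# Count the image {A' x mod q}: this is |Lambda_q(A)/qZ^m| = q^{n}.
img = {tuple(sum(A[i][j]*x[i] for i in range(n)) % q for j in range(m))
       for x in product(range(q), repeat=n)}
assert len(img) == q**n

# Combined with Lemmas 7.23 and 7.24 this gives (7.47) and (7.48):
# det(Lambda_q^perp(A)) = q^m / q^{m-n} = q^n, det(Lambda_q(A)) = q^m / q^n = q^{m-n}.
```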

7.4 Reduced Basis

In lattice theory, the Reduced basis and the corresponding LLL algorithm are among the most important topics; they have had a great impact on computational algebra, computational number theory and neighboring fields, and the LLL algorithm is recognized as one of the most important computational methods of the past 100 years. In order to introduce the Reduced basis and the LLL algorithm, we recall the Gram–Schmidt orthogonalization process summarized by Eqs. (7.31)–(7.34). Let \(\{\beta _{1},\beta _{2} ,\ldots ,\beta _{n}\}\subset \mathbb {R}^n\) be a set of bases of \(\mathbb {R}^n\), and \(\{\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{n}^{*}\}\) the corresponding Gram–Schmidt orthogonal basis, where
$$\begin{aligned} \beta _{1}^{*}=\beta _{1}, ~\beta _{i}^{*}=\beta _{i}-\sum _{j=1}^{i-1}\frac{\langle \beta _{i},\beta _{j}^{*}\rangle }{\langle \beta _{j}^{*},\beta _{j}^{*}\rangle }\beta _{j}^{*}, ~1<i\le n. \end{aligned}$$
(7.49)
The above formula can be written as
$$\begin{aligned} \beta _{i}=\sum _{j=1}^{i}\frac{\langle \beta _{i}, \beta _{j}^{*}\rangle }{\langle \beta _{j}^{*}, \beta _{j}^{*}\rangle }\beta _{j}^{*}, ~1\le i\le n. \end{aligned}$$
(7.50)
We have the following.

Lemma 7.29

Let \(\{\beta _{1},\beta _{2} ,\ldots ,\beta _{n}\}\) be a set of bases of \(\mathbb {R}^{n}\), \(\{\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{n}^{*}\}\) is the corresponding Gram–Schmidt orthogonal basis, \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k})=\mathrm{Span}\{\beta _{1},\beta _{2} ,\ldots ,\beta _{k}\}\) is a linear subspace extended by \(\beta _{1},\beta _{2} ,\ldots ,\beta _{k}\), then
  1. (i)
    $$\begin{aligned} L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k})=L(\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{k}^{*}), ~1\le k \le n. \end{aligned}$$
    (7.51)
     
  2. (ii)
    For \(1\le i \le n\), there is
    $$\begin{aligned} {\left\{ \begin{array}{ll} \langle \beta _{i},\beta _{k}^{*}\rangle =0, &{}\mathrm{when}~k>i;\\ \langle \beta _{i},\beta _{k} \rangle = \langle \beta _{k}^{*},\beta _{k}^{*} \rangle , &{}\mathrm{when}~k=i. \end{array}\right. } \end{aligned}$$
    (7.52)
     
  3. (iii)
    \(\forall ~ x \in \mathbb {R}^{n}\), \( x=\sum _{i=1}^{n}x_{i}\beta _{i}^{*}\), then
    $$\begin{aligned} x_{i}= \frac{\langle x,\beta _{i}^{*}\rangle }{\langle \beta _{i}^{*},\beta _{i}^{*}\rangle }, ~1\le i\le n. \end{aligned}$$
    (7.53)
     

Proof

The above three properties can be derived directly from Eq. (7.49) or (7.50).

Let \(U= (U_{ij})_{n\times n}\), where
$$\begin{aligned} U_{ij}=\frac{\langle \beta _{i},\beta _{j}^{*}\rangle }{\langle \beta _{j}^{*},\beta _{j}^{*}\rangle }, \Rightarrow U_{ij}=0,~ \text {when} ~j>i. ~U_{ii}=1. \end{aligned}$$
(7.54)
Therefore, U is a lower triangular matrix with 1s on the diagonal, and
$$\begin{aligned} \left[ \begin{array}{cccc} \beta _{1} \\ \beta _{2} \\ \vdots \\ \beta _{n} \end{array}\right] =U \left[ \begin{array}{cccc} \beta _{1}^{*} \\ \beta _{2}^{*} \\ \vdots \\ \beta _{n}^{*} \end{array}\right] . \end{aligned}$$
(7.55)
U is called the coefficient matrix when \(\{\beta _{1},\beta _{2} ,\ldots ,\beta _{n}\}\) is orthogonalized.
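The orthogonalization (7.49) together with the coefficient matrix U of (7.54)–(7.55) can be sketched as follows (exact rational arithmetic; the example basis is arbitrary):

```python
from fractions import Fraction

def gram_schmidt_with_U(basis):
    # Returns the Gram-Schmidt vectors beta_i^* of (7.49) and the lower
    # triangular coefficient matrix U of (7.54), so that (7.55) holds:
    # beta_i = sum_j U[i][j] * beta_j^*.
    n = len(basis)
    stars, U = [], [[Fraction(0)] * n for _ in range(n)]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            s = stars[j]
            U[i][j] = sum(Fraction(x) * c for x, c in zip(b, s)) / sum(c * c for c in s)
            v = [a - U[i][j] * c for a, c in zip(v, s)]
        U[i][i] = Fraction(1)
        stars.append(v)
    return stars, U

basis = [[3, 1], [1, 2]]
stars, U = gram_schmidt_with_U(basis)
# Check (7.55): beta_i = sum_j U[i][j] beta_j^*.
for i in range(2):
    rec = [sum(U[i][j] * stars[j][k] for j in range(2)) for k in range(2)]
    assert rec == [Fraction(x) for x in basis[i]]
```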
Let us introduce the concept of orthogonal projection: suppose \(V \subset \mathbb {R}^{k} \subset \mathbb {R}^{ n}\ (1\le k \le n)\) is a subspace, the orthogonal complement space \(V^{\perp }\) of V in \(\mathbb {R}^{k}\) is
$$\begin{aligned} V^{\perp }=\{x \in \mathbb {R}^{k}|\langle x,\alpha \rangle =0,\forall ~\alpha \in V\}. \end{aligned}$$
(7.56)
Because \(\mathbb {R}^{k}=V\oplus V^{\perp }\), every \(x \in \mathbb {R}^{k}\) can be uniquely expressed as
$$\begin{aligned} x=\alpha + \beta , \text {where}~ \alpha \in V,\beta \in V^{\perp }. \end{aligned}$$
\(\alpha \) is called the orthogonal projection of x on subspace V, obviously \(|x|^{2}=|\alpha |^{2}+|\beta |^{2}\).

Lemma 7.30

Let \(\{\beta _{1},\beta _{2} ,\ldots ,\beta _{n}\}\) be a set of bases of \(\mathbb {R}^{n}\) and \(\{\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{n}^{*}\}\) be the corresponding orthogonal basis, \(1\le k \le n\), then \(\beta _{k}^{*}\) is the orthogonal projection of \(\beta _{k}\) on the orthogonal complement space V of the subspace \(L(\beta _{1},\beta _{2},\ldots , \beta _{k-1})\) of \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k})\).

Proof

When \(k=1\), the proposition is trivial. If \(k>1\), then by Lemma 7.29,
$$\begin{aligned} L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k-1})=L(\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{k-1}^{*}). \end{aligned}$$
Therefore, the orthogonal complement space \(V=L(\beta _{k}^{*})\) of \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k-1})\) in \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k-1}, \beta _{k})\) is a one-dimensional space, because of
$$\begin{aligned} \beta _{k}=\beta _{k}^{*}+\sum _{j=1}^{k-1}u_{kj}\beta _{j}^{*}, \end{aligned}$$
and
$$\begin{aligned} \left\langle \beta _{k}^{*},\sum _{j=1}^{k-1}u_{kj}\beta _{j}^{*}\right\rangle =0. \end{aligned}$$
So \(\beta _{k}^{*}\) is the orthogonal projection of \(\beta _{k}\) on V. The Lemma holds.

Next, we discuss how the corresponding orthogonal basis transforms under elementary column transformations of the basis matrix \([\beta _{1},\beta _{2} ,\ldots ,\beta _{n}]\).

Lemma 7.31

Let \(\{\beta _{1},\beta _{2} ,\ldots , \beta _{n}\}\subset \mathbb {R}^{n}\) be a set of bases, \(\{\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{n}^{*}\}\) the corresponding orthogonal basis, and \(A=(u_{ij})_{n\times n}\) the coefficient matrix. Exchange \(\beta _{k-1}\) with \(\beta _{k}\) to get a set of bases \(\{\alpha _{1},\alpha _{2} ,\ldots , \alpha _{n}\}\) of \( \mathbb {R}^{n}\), where
$$\begin{aligned} \alpha _{k-1}=\beta _{k},\alpha _{k}=\beta _{k-1},\alpha _{i}=\beta _{i},\mathrm{when}~i\ne k-1, k. \end{aligned}$$
Let \(\{\alpha _{1}^{*},\alpha _{2}^{*} ,\ldots ,\alpha _{n}^{*}\}\) be the corresponding orthogonal basis and \(A_{1}=(v_{ij})_{n\times n}\) be the corresponding coefficient matrix, then we have
  1. (i)

    \(\alpha _{i}^{*}=\beta _{i}^{*}\), if \(i\ne k-1, k.\)

     
  2. (ii)
$$\begin{aligned} {\left\{ \begin{array}{ll} \alpha _{k-1}^{*}=\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}\\ \alpha _{k}^{*}=\beta _{k-1}^{*}-v_{kk-1}\alpha _{k-1}^{*}.\\ \end{array}\right. } \end{aligned}$$
     
  3. (iii)

    \(v_{ij}=u_{ij}\), if \(1\le j <i \le n\), and \(\{i, j\}~\bigcap ~\{k,k-1\}=\varnothing \).

     
  4. (iv)
    $$\begin{aligned} {\left\{ \begin{array}{ll} v_{ik-1}=u_{ik-1}v_{kk-1}+u_{ik}\frac{|\beta _{k}^{*}|^{2}}{|\alpha _{k-1}^{*}|^{2}}, &{}i> k.\\ v_{ik}=u_{ik-1}-u_{ik}u_{kk-1}, &{}i > k.\\ \end{array}\right. } \end{aligned}$$
     
  5. (v)

    \(v_{k-1 j}=u_{kj}\),\(v_{k j}=u_{k-1 j}, ~ 1\le j <k-1\).

     

Proof

If \(1\le i <k-1\) or \(k<i\le n\), then in \(L(\alpha _{1},\alpha _{2} ,\ldots ,\alpha _{i})=L(\beta _{1},\beta _{2} ,\ldots ,\beta _{i})\), the orthogonal complement space is
$$\begin{aligned} V=L^{\perp }(\alpha _{1},\alpha _{2} ,\ldots ,\alpha _{i-1})=L^{\perp }(\beta _{1},\beta _{2} ,\ldots ,\beta _{i-1}). \end{aligned}$$
Therefore, the orthogonal projection of \(\alpha _{i}=\beta _{i}\) on V is the same as the orthogonal projection of \(\beta _{i}\) on V, that is, \(\alpha _{i}^{*}=\beta _{i}^{*}\ (i\ne k-1,k)\), and (i) holds.
To prove (ii), note that \(\alpha _{k-1}^{*}\) is the orthogonal projection of \(\beta _{k}(=\alpha _{k-1})\) on the orthogonal complement space of \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k-2})\). Because of
$$\begin{aligned} \begin{aligned} \beta _{k}^{*}&=\beta _{k}-\sum _{j=1}^{k-1}u_{kj}\beta _{j}^{*}\\&=\beta _{k}-u_{kk-1}\beta _{k-1}^{*}-\sum _{j=1}^{k-2}u_{kj}\beta _{j}^{*}, \end{aligned} \end{aligned}$$
and \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k-2})=L(\beta _{1}^{*},\beta _{2} ^{*},\ldots ,\beta _{k-2}^{*})\), there is
$$\begin{aligned} \alpha _{k-1}^{*}=\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}. \end{aligned}$$
Similarly, \(\alpha _{k}^{*}\) is the orthogonal projection of \(\beta _{k-1}^{*}\) on the orthogonal complement of \(L(\alpha _{k-1}^{*})\), thus
$$\begin{aligned} \alpha _{k}^{*}=\beta _{k-1}^{*}-v_{kk-1}\alpha _{k-1}^{*}. \end{aligned}$$
where
$$\begin{aligned} \begin{aligned} v_{kk-1}&=\frac{\langle \beta _{k-1}^{*},\alpha _{k-1}^{*}\rangle }{|\alpha _{k-1}^{*}|^{2}}\\&=\frac{\langle \beta _{k-1}^{*},u_{kk-1}\beta _{k-1}^{*}\rangle }{|\alpha _{k-1}^{*}|^{2}}\\&=u_{kk-1}\frac{|\beta _{k-1}^{*}|^{2}}{|\alpha _{k-1}^{*}|^{2}}, \end{aligned} \end{aligned}$$
thus (ii) holds. Similarly, other properties can be proved. Lemma 7.31 holds.

Lemma 7.32

Let \(\{\beta _{1},\beta _{2} ,\ldots ,\beta _{n}\}\) be a set of bases of \(\mathbb {R}^{n}\), \(\{\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{n}^{*}\}\) be the corresponding orthogonal basis, and \(A=(u_{ij})_{n\times n}\) be the coefficient matrix. For any \(k\ge 2\), if we replace \(\beta _{k}\) with \(\beta _{k}-r\beta _{k-1}\) and keep the other \(\beta _{i}\) unchanged \((i \ne k)\), we get a new set of bases
$$\begin{aligned} \{\alpha _{1},\alpha _{2} ,\ldots ,\alpha _{n}\}=\{\beta _{1},\beta _{2} ,\ldots ,\beta _{k-1},\beta _{k}-r\beta _{k-1},\beta _{k+1},\ldots ,\beta _{n}\}. \end{aligned}$$
Let \(\{\alpha _{1}^{*},\alpha _{2}^{*} ,\ldots ,\alpha _{n}^{*}\}\) be the corresponding orthogonal basis and \(A_{1}=(v_{ij})_{n\times n}\) the corresponding coefficient matrix, then we have
  1. (i)

    \(\alpha _{i}^{*}=\beta _{i}^{*}\), \(\forall ~ 1\le i\le n\), that is, \(\beta _{i}^{*}\) remains unchanged.

     
  2. (ii)

    \(v_{ij}=u_{ij}\), if \( 1\le j<i\le n\), \(i\ne k.\)

     
  3. (iii)
    $$\begin{aligned} {\left\{ \begin{array}{ll} v_{kj}=u_{kj} -ru_{k-1,j}, &{}\mathrm{if}~ j<k-1\\ v_{kk-1}=u_{kk-1} -r, &{}\mathrm{if}~j=k-1. \end{array}\right. } \end{aligned}$$
     

Proof

When \(i<k\) or \(i>k\), \(\alpha _{i}^{*}=\beta _{i}^{*}\) is trivial; to prove (i), it suffices to consider \(i=k\). \(\alpha _{k}^{*}\) is the orthogonal projection of \(\alpha _{k}=\beta _{k}-r\beta _{k-1}\) on the orthogonal complement space \(L(\alpha _{k}^{*})=L(\beta _{k}^{*})\) of \(L(\beta _{1},\beta _{2} ,\ldots ,\beta _{k-1})=L(\alpha _{1},\alpha _{2} ,\ldots ,\alpha _{k-1})\), and
$$\begin{aligned} \begin{aligned} \beta _{k}^{*}&=\beta _{k}-\sum _{j=1}^{k-1}u_{kj}\beta _{j}^{*}\\&=\beta _{k}-r\beta _{k-1}-\left( \sum _{j=1}^{k-2}u_{kj}\beta _{j}^{*}+(u_{kk-1}-r)\beta _{k-1}^{*}\right) \\&=\alpha _{k}-\left( \sum _{j=1}^{k-2}u_{kj}\beta _{j}^{*}+(u_{kk-1}-r)\beta _{k-1}^{*}\right) . \end{aligned} \end{aligned}$$
This proves that \(\alpha _{k}^{*}=\beta _{k}^{*}\). Thus (i) holds. To prove (ii), when \(i \ne k\), we have
$$\begin{aligned} v_{ij}=\frac{\langle \alpha _{i},\alpha _{j}^{*}\rangle }{|\alpha _{j}^{*}|^{2}} =\frac{\langle \beta _{i},\beta _{j}^{*}\rangle }{|\beta _{j}^{*}|^{2}}=u_{ij}, \end{aligned}$$
that is (ii) holds. When \(i = k\),
$$\begin{aligned} \begin{aligned} v_{kj}&=\frac{\langle \alpha _{k},\alpha _{j}^{*}\rangle }{|\alpha _{j}^{*}|^{2}}\\&=\frac{\langle \beta _{k}-r\beta _{k-1},\beta _{j}^{*}\rangle }{|\alpha _{j}^{*}|^{2}}(1 \le j <k \le n)\\&=\frac{\langle \beta _{k},\beta _{j}^{*}\rangle }{|\beta _{j}^{*}|^{2}}-r\frac{\langle \beta _{k-1},\beta _{j}^{*}\rangle }{|\beta _{j}^{*}|^{2}}\\&=u_{kj}-ru_{k-1j}.\\ \end{aligned} \end{aligned}$$
The above formula holds for all \(1 \le j \le k-1\); thus (iii) holds and the Lemma holds.

Next, we introduce the concept of a Reduced basis of \(\mathbb {R}^{n}\).

Definition 7.10

Let \(\{\beta _{1},\beta _{2} ,\ldots ,\beta _{n}\} \subset \mathbb {R}^{n}\) be a set of bases, \(\{\beta _{1}^{*},\beta _{2}^{*}, \ldots , \beta _{n}^{*}\}\) be the corresponding orthogonal basis, and \(A=(u_{ij})_{n\times n}\) be the coefficient matrix. \(\{\beta _{1},\beta _{2} ,\ldots , \beta _{n}\}\) is called a set of Reduced bases of \(\mathbb {R}^{n}\), if
$$\begin{aligned} {\left\{ \begin{array}{ll} &{}(\mathrm{i}) ~|u_{ij}|\le \frac{1}{2}, \forall ~1 \le j< i\le n.\\ &{}(\mathrm{ii}) ~|\beta _{i}^{*}-u_{ii-1}\beta _{i-1}^{*}|^{2}\ge \frac{3}{4}|\beta _{i-1}^{*}|^{2}, \forall ~1 < i\le n. \end{array}\right. } \end{aligned}$$
(7.57)
A set of Reduced bases of \(\mathbb {R}^{n}\) is sometimes called a Lovász Reduced basis, which is of great significance in lattice theory. The important result of this section is that any lattice L in \(\mathbb {R}^{n}\) has a Reduced basis, and the method to calculate a Reduced basis is the famous LLL algorithm.

Theorem 7.3

Let \(L \subset \mathbb {R}^{n}\) be a lattice (full rank lattice), then there is a generating matrix \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n}]\) of L such that \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\) is a Reduced basis of \(\mathbb {R}^{n}\); it is also called a Reduced basis of the lattice \(L=L(B)\).

Proof

Let \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n} ]\), \(L=L(B)\), first we prove
$$\begin{aligned} |u_{kk-1}|\le \frac{1}{2}, \forall ~1 < k\le n. \end{aligned}$$
(7.58)
If there is a \(k>1\) for which the above formula does not hold, let r be the integer nearest to \(u_{kk-1}\); obviously,
$$\begin{aligned} |u_{kk-1}-r|\le \frac{1}{2}. \end{aligned}$$
In \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\), replace \(\beta _{k}\) with \(\beta _{k}-r\beta _{k-1}\), thus by Lemma 7.32,
$$\begin{aligned} u_{kj}\rightarrow u_{kj}-ru_{k-1j}, ~1\le j \le k-1. \end{aligned}$$
Specially, when \(j=k-1\),
$$\begin{aligned} u_{kk-1}\rightarrow u_{kk-1}-r, \end{aligned}$$
under the new basis, all \(\beta _{i}^{*}\) and the coefficients \(u_{ij}\) with \(1 \le j<i\), \(i \ne k\) remain unchanged, so Eq. (7.58) holds under the new basis.
In the second step of the LLL algorithm, we prove that
$$\begin{aligned} |\beta _{k}^{*}-u_{kk-1}\beta _{k-1}^{*}|^{2}\ge \frac{3}{4}|\beta _{k-1}^{*}|^{2},\forall ~ 1 < k\le n. \end{aligned}$$
(7.59)
By (7.4),
$$\begin{aligned} |\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}|^{2}=|\beta _{k}^{*}-u_{kk-1}\beta _{k-1}^{*}|^{2}. \end{aligned}$$
Therefore, the sign inside the absolute value on the left of Eq. (7.59) can be chosen arbitrarily. If there is a k, \(1 < k\le n\), such that (7.59) does not hold, that is
$$\begin{aligned} |\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}|^{2}< \frac{3}{4}|\beta _{k-1}^{*}|^{2}. \end{aligned}$$
(7.60)
In this case, if \(\beta _{k}\) and \(\beta _{k-1}\) are exchanged and the other \(\beta _{i}\) remains unchanged, there is a new set of bases \(\{\alpha _{1},\alpha _{2} ,\ldots ,\alpha _{n}\}\), the corresponding orthogonal basis \(\{\alpha _{1}^{*},\alpha _{2}^{*} ,\ldots ,\alpha _{n}^{*}\}\) and the coefficient matrix \(A_{1}=(v_{ij})_{n\times n}\), where
$$\begin{aligned} \alpha _{i}=\beta _{i}(i \ne k-1,k), ~\alpha _{k-1}=\beta _{k}, ~\alpha _{k}=\beta _{k-1}. \end{aligned}$$
Let us prove that under the new basis \(\{\alpha _{1},\alpha _{2} ,\ldots ,\alpha _{n}\}\), there is
$$\begin{aligned} |\alpha _{k}^{*}+v_{kk-1}\alpha _{k-1}^{*}|^{2} \ge \frac{3}{4}|\alpha _{k-1}^{*}|^{2}, \end{aligned}$$
(7.61)
by Lemma 7.31,
$$\begin{aligned} {\left\{ \begin{array}{ll} \alpha _{k-1}^{*}=\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}\\ \alpha _{k}^{*}=\beta _{k-1}^{*}-v_{kk-1}\beta _{k-1}^{*}.\\ \end{array}\right. } \end{aligned}$$
By (7.60), we have
$$\begin{aligned} |\alpha _{k-1}^{*}|^{2} < \frac{3}{4}|\alpha _{k}^{*}+v_{kk-1}\alpha _{k-1}^{*}|^{2}. \end{aligned}$$
That is
$$\begin{aligned} |\alpha _{k}^{*}+v_{kk-1}\alpha _{k-1}^{*}|^{2}>\frac{4}{3}|\alpha _{k-1}^{*}|^{2}>\frac{3}{4}|\alpha _{k-1}^{*}|^{2}. \end{aligned}$$
Thus (7.61) holds. Applying the above method repeatedly, it can be proved that formula (7.59) holds for all \(k>1\); however, when k is replaced by \(k-1\), the new \(\beta _{k-1}^{*}\) is replaced by
$$\begin{aligned} \beta _{k-1}^{*} \rightarrow \beta _{k-1}^{*}+u_{k-1k-2}\beta _{k-2}^{*}=\overline{\beta _{k-1}^{*}}. \end{aligned}$$
We have to prove that (7.59) remains valid when \(k-1\) is used instead of k. In fact,
$$\begin{aligned} \begin{aligned} |\beta _{k}^{*}+u_{kk-1}\overline{\beta _{k-1}^{*}}|^{2}&=|\beta _{k}^{*}+u_{kk-1}(\beta _{k-1}^{*}+u_{k-1k-2}\beta _{k-2}^{*})|^{2}\\&=|\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}|^{2}+|u_{kk-1}u_{k-1k-2}\beta _{k-2}^{*}|^{2}\\&\ge \frac{3}{4}(|\beta _{k-1}^{*}|^{2}+u_{kk-1}^{2}|u_{k-1k-2}\beta _{k-2}^{*}|^{2})\\&\ge \frac{3}{4}(|\beta _{k-1}^{*}|^{2}+|u_{k-1k-2}\beta _{k-2}^{*}|^{2})\\&=\frac{3}{4}|\beta _{k-1}^{*}+u_{k-1k-2}\beta _{k-2}^{*}|^{2}\\&=\frac{3}{4}|\overline{\beta _{k-1}^{*}}|^{2}. \end{aligned} \end{aligned}$$
Therefore, Eq. (7.59) does not change when the transformation of commutative vector is carried out continuously; that is, Eq. (7.59) holds for all k, \(1<k\le n\).
The third step of the LLL algorithm, let’s prove that
$$\begin{aligned} |u_{kj}|\le \frac{1}{2}, \forall ~1\le j<k\le n. \end{aligned}$$
(7.62)
When \(j=k-1\), (7.62) is just (7.58). For given k, \(1<k\le n\), if (7.62) does not hold, let l be the largest subscript such that \(|u_{kl}|> \frac{1}{2}\). Let r be the integer nearest to \(u_{kl}\), then \(|u_{kl}-r|\le \frac{1}{2}\). Replace \(\beta _{k}\) with \(\beta _{k}-r\beta _{l}\); by Lemma 7.32, all \(\beta _{i}^{*}\) remain unchanged and the coefficient matrix changes to:
$$\begin{aligned} {\left\{ \begin{array}{ll} &{}v_{kj}=u_{kj} -ru_{lj},1\le j<l\\ &{}v_{kl}=u_{kl} -r. \end{array}\right. } \end{aligned}$$
while the other \(u_{ij}\) remain unchanged; at this time,
$$\begin{aligned} |u_{kl}-r|=|v_{kl}|\le \frac{1}{2}. \end{aligned}$$
So we have Eq. (7.62) for all \(1\le j<k\le n\).

The above matrix transformations amount to multiplying by a unimodular matrix from the right, so a Reduced basis B of the lattice \(L=L(B)\) is finally obtained. This completes the proof of Theorem 7.3.
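The three steps of the proof translate into the classical LLL procedure: size-reduce so that \(|u_{kj}|\le \frac{1}{2}\), test condition (ii) of (7.57), and swap \(\beta _{k-1},\beta _{k}\) when it fails. A minimal sketch in exact rational arithmetic (an illustrative, unoptimized implementation, not the book's pseudocode):

```python
from fractions import Fraction

def lll(basis, delta=Fraction(3, 4)):
    # Minimal LLL following the steps of Theorem 7.3's proof.
    B = [[Fraction(x) for x in v] for v in basis]
    n = len(B)

    def gso():
        # Gram-Schmidt vectors and coefficient matrix, as in (7.49)/(7.54).
        stars, U = [], [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = B[i][:]
            for j in range(i):
                U[i][j] = sum(a*c for a, c in zip(B[i], stars[j])) / \
                          sum(c*c for c in stars[j])
                v = [a - U[i][j]*c for a, c in zip(v, stars[j])]
            U[i][i] = Fraction(1)
            stars.append(v)
        return stars, U

    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):          # size reduction: |u_kj| <= 1/2
            stars, U = gso()
            r = round(U[k][j])
            if r:
                B[k] = [a - r*c for a, c in zip(B[k], B[j])]
        stars, U = gso()
        # |beta_k^* + u_{kk-1} beta_{k-1}^*|^2 >= delta * |beta_{k-1}^*|^2 ?
        lhs = sum(c*c for c in stars[k]) + U[k][k-1]**2 * sum(c*c for c in stars[k-1])
        if lhs >= delta * sum(c*c for c in stars[k-1]):
            k += 1
        else:                                   # swap beta_{k-1}, beta_k
            B[k-1], B[k] = B[k], B[k-1]
            k = max(k - 1, 1)
    return [[int(x) for x in v] for v in B]

reduced = lll([[1, 1], [1, 0]])
# The reduced basis consists of short, nearly orthogonal vectors.
assert sorted(v[0]*v[0] + v[1]*v[1] for v in reduced) == [1, 1]
```

Recomputing the full orthogonalization at every step keeps the sketch short; practical implementations update the coefficients incrementally using Lemmas 7.31 and 7.32.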

Lemma 7.33

Let \(L=L(B)\) be a lattice, where B is a Reduced basis of L and \(B^{*}=[\beta _{1}^{*},\beta _{2}^{*} ,\ldots ,\beta _{n}^{*}]\) is the corresponding orthogonal basis, then for any \(1\le j<i\le n\), we have
$$\begin{aligned} |\beta _{j}^{*}|^2 \le 2^{i-j}|\beta _{i}^{*}|^2. \end{aligned}$$

Proof

Because \(B=[\beta _{1},\beta _{2}, \ldots ,\beta _{n}]\) is a Reduced basis, then
$$\begin{aligned} |\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}|^{2}\ge \frac{3}{4}|\beta _{k-1}^{*}|^{2}. \end{aligned}$$
Thus
$$\begin{aligned} |\beta _{k}^{*}+u_{kk-1}\beta _{k-1}^{*}|^{2}=|\beta _{k}^{*}|^{2}+u_{kk-1}^{2}|\beta _{k-1}^{*}|^{2}\ge \frac{3}{4}|\beta _{k-1}^{*}|^{2}. \end{aligned}$$
There is
$$\begin{aligned} \begin{aligned} |\beta _{k}^{*}|^{2}&\ge \frac{3}{4}|\beta _{k-1}^{*}|^{2}-u_{kk-1}^{2}|\beta _{k-1}^{*}|^{2}\\&\ge \frac{3}{4}|\beta _{k-1}^{*}|^{2}-\frac{1}{4}|\beta _{k-1}^{*}|^{2}\\&= \frac{1}{2}|\beta _{k-1}^{*}|^{2}. \end{aligned} \end{aligned}$$
So for given \(1\le j<i\le n\), we have
$$\begin{aligned} \begin{aligned} |\beta _{i}^{*}|^{2}&\ge \frac{1}{2}|\beta _{i-1}^{*}|^{2}\\&\ge \frac{1}{4}|\beta _{i-2}^{*}|^{2}\\&\ge \cdots \\&\ge 2^{-(i-j)}|\beta _{j}^{*}|^{2}, \end{aligned} \end{aligned}$$
thus
$$\begin{aligned} |\beta _{j}^{*}|^{2}\le 2^{i-j}|\beta _{i}^{*}|^2. \end{aligned}$$

Remark 7.3

In the definition of a Reduced basis, the coefficient \(\frac{3}{4}\) in the second inequality of (7.57) can be replaced by any \(\delta \), where \(\frac{1}{4}<\delta <1\). In particular, Babai (1986) pointed out that the second inequality of Eq. (7.57) can be replaced by the following weaker inequality,
$$\begin{aligned} |\beta _{i}^{*}|\le \frac{1}{2} |\beta _{i-1}^{*}|. \end{aligned}$$
(7.63)
Let’s discuss the computational complexity of the LLL algorithm. Let \(B=\{\beta _{1},\beta _{2},\ldots , \beta _{n}\}\) be any set of bases, for any \(0\le k\le n\), we define
$$\begin{aligned} d_{0}=1, d_{k}=\det (\langle \beta _{i},\beta _{j}\rangle _{k \times k}). \end{aligned}$$
(7.64)
If \(\{\beta _{1}^{*},\beta _{2}^{*},\ldots , \beta _{n}^{*} \}\) is the orthogonal basis corresponding to \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\), there is obviously
$$\begin{aligned} d_{k}=\prod _{i=1}^{k}|\beta _{i}^{*}|^{2}, 0< k\le n. \end{aligned}$$
(7.65)
Thus, \(d_{i}\) is a positive number, and \(d_{n}=d(L)^{2}\). Let
$$\begin{aligned} D=\prod _{k=1}^{n-1}d_{k}, \end{aligned}$$
(7.66)
We first prove that \(d_{k} (0< k\le n)\) and D have lower bounds.

Lemma 7.34

Let
$$\begin{aligned} m(L)=\lambda (L)^{2}=\min \{|x|^{2}:x\in L,x\ne 0\}. \end{aligned}$$
Then
$$\begin{aligned} d_{k}\ge \left( \frac{3}{4}\right) ^{\frac{k(k-1)}{2}}m(L)^{k}, 1\le k\le n. \end{aligned}$$

Proof

The determinant of k-dimensional lattice \(L_{k}=L(\beta _{1},\beta _{2},\ldots , \beta _{k})\subset \mathbb {R}^{k} (1\le k\le n)\) has
$$\begin{aligned} d^{2}(L_{k})=d_{k}. \end{aligned}$$
By the conclusion of Cassels (1971), there is a nonzero lattice point \(x\in L_{k}\) satisfying
$$\begin{aligned} |x|^{2}\le \left( \frac{4}{3}\right) ^{\frac{k-1}{2}}d_{k}^{\frac{1}{k}}. \end{aligned}$$
(7.67)
Since \(m(L_{k})\le |x|^{2}\) and \(m(L_{k})\ge m(L)\) (as \(L_{k}\subset L\)), it follows that
$$\begin{aligned} \begin{aligned} d_{k}&\ge \left( \frac{3}{4}\right) ^{\frac{k(k-1)}{2}}m(L_{k})^{k}\\&\ge \left( \frac{3}{4}\right) ^{\frac{k(k-1)}{2}}(m(L))^{k}. \end{aligned} \end{aligned}$$
The Lemma holds.

Another important conclusion of this section concerns integer lattices: we estimate the computational complexity of obtaining a Reduced basis of an integer lattice by the LLL algorithm, and prove that the LLL algorithm on an integer lattice runs in polynomial time.

Theorem 7.4

Let \(L=L(B) \subset \mathbb {Z}^{n}\) be an integer lattice, \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n} ]\) is the generating matrix, suppose N satisfies
$$\begin{aligned} \max _{1\le i\le n}|\beta _{i}|^{2}\le N. \end{aligned}$$
Then the computational complexity of the Reduced basis of L obtained by B using the LLL algorithm is
$$\begin{aligned} \mathrm{Time}(\mathrm{LLL} \; \mathrm{algorithm})=O(n^{4}\log N). \end{aligned}$$
All integers appearing in the LLL algorithm have \(O(n\log N)\) binary digits, so the computational complexity of the LLL algorithm on the integer lattice is polynomial.

Proof

By (7.36), we have
$$|\beta _{i}^{*}|\le |\beta _{i}|, 1 \le i \le n.$$
where \(\{\beta _{1}^{*},\beta _{2}^{*},\ldots , \beta _{n}^{*} \}\) is the orthogonal basis corresponding to \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\), then by (7.65) and (7.66), we have
$$\begin{aligned} d_{k}=\prod _{i=1}^{k}|\beta _{i}^{*}|^{2}\le \prod _{i=1}^{k}|\beta _{i}|^{2}\le N^{k},1\le k\le n. \end{aligned}$$
And
$$\begin{aligned} 1\le D \le N^{\frac{n(n-1)}{2}}. \end{aligned}$$
(7.68)
The inequality on the left of the above formula holds because each \(d_{k} \in \mathbb {Z}\) and \(d_{k}\ge 1\), so by (7.66), \(D \ge 1\). Therefore, O(n) arithmetic operations are required in the first step of the LLL algorithm, \(O(n^{3})\) arithmetic operations in the second and third steps, and the number of bit operations per arithmetic operation is at most Time (calculate D), thus
$$\begin{aligned} \mathrm{Time} (\mathrm{LLL} \; \mathrm{algorithm})\le O(n^{3})\mathrm{Time} (\mathrm{calculate}\;D)=O(n^{4}\log N). \end{aligned}$$
Therefore, the first conclusion of Theorem 7.4 is proved. The second conclusion is more involved and we omit it; interested readers may refer to the original 1982 paper of A. K. Lenstra, H. W. Lenstra and L. Lovász.
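The LLL algorithm whose complexity is analyzed above can be illustrated by a short program. The following Python sketch is our own illustrative rendering (not part of the original text), with the parameter \(\delta =\frac{3}{4}\) of the Reduced basis definition; it uses exact rational arithmetic and recomputes the Gram–Schmidt data after every update, which is wasteful but keeps the logic transparent.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(B):
    """Return the orthogonal basis B* and the coefficients u_{ij}."""
    n = len(B)
    Bs, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in B[i]]
        for j in range(i):
            mu[i][j] = dot(B[i], Bs[j]) / dot(Bs[j], Bs[j])
            v = [vt - mu[i][j] * wt for vt, wt in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu

def lll(B, delta=Fraction(3, 4)):
    """LLL-reduce an integer basis (rows of B are the basis vectors)."""
    B = [list(row) for row in B]
    n = len(B)
    Bs, mu = gram_schmidt(B)
    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):        # size reduction: |u_{kj}| <= 1/2
            q = round(mu[k][j])
            if q:
                B[k] = [a - q * b for a, b in zip(B[k], B[j])]
                Bs, mu = gram_schmidt(B)
        # the second (Lovász) inequality of the Reduced-basis definition
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1
        else:
            B[k - 1], B[k] = B[k], B[k - 1]   # swap the pair and step back
            Bs, mu = gram_schmidt(B)
            k = max(k - 1, 1)
    return B
```

On the small example basis \([1,1,1],[-1,0,2],[3,5,6]\) the output satisfies both conditions of a Reduced basis, and its first vector obeys the bound of Theorem 7.5 below.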

7.5 Approximation of SVP and CVP

The most important application of the lattice Reduced basis and the LLL algorithm is to provide approximation algorithms for the shortest vector problem (SVP) and the closest vector problem (CVP), and to obtain some approximate results. Firstly, we prove the following Lemma.

Lemma 7.35

Let \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\) be a Reduced basis of a lattice L, \(\{\beta _{1}^{*},\beta _{2}^{*},\ldots , \beta _{n}^{*} \}\) be the corresponding orthogonal basis, and d(L) be the determinant of L, then we have
  1. (i)
    $$\begin{aligned} d(L) \le \prod _{i=1}^{n}|\beta _{i}| \le 2^{\frac{n(n-1)}{4}}d(L). \end{aligned}$$
    (7.69)
     
  2. (ii)
    $$\begin{aligned} |\beta _1|\le 2^{\frac{n-1}{4}}d(L)^{\frac{1}{n}}. \end{aligned}$$
    (7.70)
     

Proof

The inequality on the left of (i), Hadamard's inequality, was given in Lemma 7.17. The inequality on the right of (i) gives an upper bound for \(\prod _{i=1}^{n}|\beta _{i}|\); by Lemma 7.33,
$$\begin{aligned} |\beta _{j}^{*}|\le 2^{\frac{i-j}{2}}|\beta _{i}^{*}|, 1 \le j < i \le n. \end{aligned}$$
(7.71)
By the Gram–Schmidt orthogonalization,
$$\begin{aligned} \beta _{i}=\beta _{i}^{*}+\sum _{j=1}^{i-1}u_{ij}\beta _{j}^{*}. \end{aligned}$$
We get
$$\begin{aligned} |\beta _{i}|^{2}&=|\beta _{i}^{*}|^{2}+\sum _{j=1}^{i-1}u_{ij}^{2}|\beta _{j}^{*}|^{2}\nonumber \\&\le |\beta _{i}^{*}|^{2}+\frac{1}{4}\sum _{j=1}^{i-1} |\beta _{j}^{*}|^{2}\nonumber \\&\le \left( 1+\frac{1}{4}\sum _{j=1}^{i-1}2^{i-j}\right) |\beta _{i}^{*}|^{2}\\&= \left( 1+\frac{1}{4}(2^{i}-2)\right) |\beta _{i}^{*}|^{2}\nonumber \\&\le 2^{i-1} |\beta _{i}^{*}|^{2}.\nonumber \end{aligned}$$
(7.72)
There is
$$\begin{aligned} \begin{aligned} \prod _{i=1}^{n}|\beta _{i }|^{2}&\le \prod _{i=1}^{n}2^{i-1} |\beta _{i}^{*}|^{2}\\&= 2^{\sum _{i=0}^{n-1}i}\prod _{i=1}^{n}|\beta _{i}^{*}|^{2}\\&= 2^{\frac{n}{2}(n-1)}\prod _{i=1}^{n}|\beta _{i}^{*}|^{2}\\&= 2^{\frac{n}{2}(n-1)}(d(L))^{2}. \end{aligned} \end{aligned}$$
So
$$\begin{aligned} \prod _{i=1}^{n}|\beta _{i}|\le 2^{\frac{n}{4}(n-1)}d(L). \end{aligned}$$
so (7.69) holds. To prove (ii), by (7.72) and (7.71),
$$\begin{aligned} |\beta _{j}|^{2}\le 2^{j-1}|\beta _{j}^{*}|^{2}\le 2^{j-1}2^{i-j}|\beta _{i}^{*}|^{2}=2^{i-1}|\beta _{i}^{*}|^{2}. \end{aligned}$$
(7.73)
for all \(1 \le j \le i \le n \). In particular, taking \(j=1\),
$$\begin{aligned} |\beta _{1}|^{2}\le 2^{i-1}|\beta _{i}^{*}|^{2},1 \le i \le n . \end{aligned}$$
Thus
$$\begin{aligned} \begin{aligned} |\beta _{1}|^{2n}&\le 2^{\sum _{i=1}^{n}(i-1)} \prod _{i=1}^{n}|\beta _{i}^{*}|^{2}\\&= 2^{\frac{n}{2}(n-1)}(d(L))^{2}. \end{aligned} \end{aligned}$$
So
$$\begin{aligned} |\beta _{1}|\le 2^{\frac{n-1}{4}}d(L)^{ \frac{1}{n}} . \end{aligned}$$
Lemma 7.35 holds!

The following theorem shows that if \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\) is a Reduced basis of a lattice L, then \(\beta _{1}\) is an approximation vector of the shortest vector \(u_{0}\) of lattice L, with approximation coefficient \(r_n=2^{\frac{n-1}{2}}\).

Theorem 7.5

Let \(L=L(B) \subset \mathbb {R}^{n}\) be a lattice (full rank lattice), \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n}]\) is a set of Reduced bases of L, \(\lambda _{1}=\lambda (L)\) is the minimal distance of L, then
$$\begin{aligned} |\beta _{1}|\le 2^{\frac{n-1}{2}} \lambda _{1}=2^{\frac{n-1}{2}}\lambda (L). \end{aligned}$$
(7.74)

Proof

We only prove that for \(\forall ~ x \in L\), \(x \ne 0\), there is
$$\begin{aligned} |\beta _{1}|^2 \le 2^{n-1} |x|^{2},\forall ~ x \in L, x \ne 0. \end{aligned}$$
(7.75)
When \( x \in L\), \(x \ne 0\) given, let
$$\begin{aligned} x=\sum _{i=1}^{n}r_{i}\beta _{i} =\sum _{i=1}^{n}r_{i}^{'}\beta _{i}^{*}, r_{i} \in \mathbb {Z}, r_{i}^{'} \in \mathbb {R},1 \le i\le n. \end{aligned}$$
Let k be the largest subscript such that \(r_{k} \ne 0\); then \(r_{k}=r_{k}^{'}\) (this is proved in Remark 7.4 below). So
$$\begin{aligned} |x|^{2}\ge r_{k}^{2}|\beta _{k}^{*}|^2\ge |\beta _{k}^{*}|^{2} \ge 2^{1-k}|\beta _{1}|^2. \end{aligned}$$
(7.76)
Thus
$$\begin{aligned} |\beta _{1}|^{2}\le 2^{k-1} |x|^{2}\le 2^{n-1}|x|^{2}, x \in L,x \ne 0. \end{aligned}$$
That is (7.75) holds, thus Theorem 7.5 holds.
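For a small concrete instance (a 2-dimensional Reduced basis of our own choosing, not from the original text), the bound (7.75) can be checked by brute force:

```python
from itertools import product

# an already-Reduced basis of a 2-dimensional lattice (rows)
b1, b2 = (1, 1), (-1, 1)

def norm2(v):
    return v[0] ** 2 + v[1] ** 2

# brute-force m(L) = lambda_1^2 over small coefficients; for this lattice
# the shortest nonzero vectors already occur in this search range
m = min(norm2((a * b1[0] + c * b2[0], a * b1[1] + c * b2[1]))
        for a, c in product(range(-5, 6), repeat=2) if (a, c) != (0, 0))

assert norm2(b1) <= 2 ** (2 - 1) * m        # (7.75) with n = 2
```

Here every lattice point has squared norm \(2(a^{2}+c^{2})\), so \(m=2=|\beta _{1}|^{2}\) and the bound holds with room to spare.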

The following result shows that not only the shortest vector: the whole Reduced basis approximates the Successive Shortest vectors of the lattice.

Lemma 7.36

Let \(L \subset \mathbb {R}^{n}\) be a lattice, \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\) is a Reduced base of L, let \(\{x_{1},x_{2},\ldots , x_{t} \}\subset L\) be t linearly independent lattice points, then
$$\begin{aligned} |\beta _{j}|^{2} \le 2^{n-1} \max \{ |x_{1}|^{2},|x_{2}|^{2},\ldots , |x_{t}|^{2} \}. \end{aligned}$$
(7.77)
For all \(1\le j\le t\) holds.

Proof

Write
$$\begin{aligned} x_{j}=\sum _{i=1}^{n}r_{ij}\beta _{i}, r_{ij} \in \mathbb {Z}, 1 \le i\le n, 1 \le j\le t. \end{aligned}$$
For fixed j, let i(j) be the largest positive integer i such that \(r_{ij} \ne 0\); by (7.76), we have
$$\begin{aligned} |x_{j}|^{2} \ge |\beta _{i(j)}^{*}|^{2}, ~1 \le j\le t. \end{aligned}$$
Change the order of \( x_{j}\) to ensure \(i(1) \le i(2) \le \cdots \le i(t) \); then \(j \le i(j)\) holds for all \(1 \le j\le t \). Otherwise, if \(i(j)<j\) for some j, then
$$\begin{aligned} \{x_{1},x_{2},\ldots , x_{j} \} \subset L(\beta _{1},\beta _{2},\ldots , \beta _{j-1}), \end{aligned}$$
which contradicts the linear independence of \( x_{1},x_{2},\ldots , x_{j}\). Thus \(j \le i(j)\). By (7.73) of Lemma 7.35, then
$$\begin{aligned} \begin{aligned} |\beta _{j}|^{2}&\le 2^{i(j)-1}|\beta _{i(j)}^{*}|^{2}\\&\le 2^{n-1}{|\beta _{i(j)}^{*}|}^{2}\\&\le 2^{n-1}|x_{j}|^{2},\forall ~ 1 \le j\le t. \end{aligned} \end{aligned}$$
Thus (7.77) holds, the Lemma holds.

Remark 7.4

We give a proof of \(r_{k}=r_{k}^{'}\) in Theorem 7.5. Because k is the largest subscript such that \(r_{k} \ne 0\), we have
$$\begin{aligned} x=\sum _{i=1}^{k}r_{i}\beta _{i}=\sum _{i=1}^{k}r_{i}^{'}\beta _{i}^{*}. \end{aligned}$$
By (7.52) and (7.53),
$$\begin{aligned} r_{k}^{'}=\frac{\langle x,\beta _{k}^{*} \rangle }{|\beta _{k}^{*}|^{2}}, r_{k}=\frac{\langle x,\beta _{k}^{*}\rangle }{\langle \beta _{k},\beta _{k}^{*}\rangle }. \end{aligned}$$
Because \(\langle \beta _{k},\beta _{k}^{*}\rangle =\langle \beta _{k}^{*},\beta _{k}^{*}\rangle \), so
$$\begin{aligned} r_{k}^{'}=\frac{\langle x,\beta _{k}^{*}\rangle }{\langle \beta _{k},\beta _{k}^{*}\rangle }=r_k. \end{aligned}$$
In order to discuss the approximation of the Successive Shortest vectors of a lattice, let us recall the definitions of the successive minima \(\lambda _{1},\lambda _{2},\ldots ,\lambda _{n}\) and of the Successive Shortest vectors. By Definition 7.6 and Corollary 7.6 in Sect. 7.2, the successive minima \(\lambda _{1},\lambda _{2},\ldots ,\lambda _{n}\) of a full rank lattice are attained: for all \(1\le i\le n\), there is
$$\begin{aligned} |\alpha _{i}|=\lambda _{i}, ~\alpha _{i}\in L,1\le i\le n. \end{aligned}$$
The vectors \(\alpha _{1},\alpha _{2},\ldots ,\alpha _{n}\) are called Successive Shortest vectors: \(|\alpha _{i}|\) is shortest subject to \(\alpha _{i}\) being linearly independent of \(\{\alpha _{1}, \alpha _{2}, \ldots , \alpha _{i-1}\}\).

Theorem 7.6

Let \(\{\beta _{1},\beta _{2},\ldots , \beta _{n} \}\) be a Reduced basis of lattice L, and \(\lambda _{1},\lambda _{2},\ldots ,\lambda _{n}\) be the successive minima of L, then we have
$$\begin{aligned} |\beta _{i}|^{2} \le 2^{n-1}\lambda _{i}^{2}, 1\le i\le n. \end{aligned}$$
(7.78)

Proof

We make an induction on i. Because \(\{\beta _{1},\beta _{2},\ldots , \beta _{i}\}\) is a Reduced basis of the lattice \(L_{i}\) in \(\mathbb {R}^{i}\), the proposition is obviously true when \(i=1\) (see Theorem 7.5). If the proposition holds for \(i-1\), then applying Lemma 7.36 to the linearly independent Successive Shortest vectors \(\alpha _{1},\alpha _{2},\ldots ,\alpha _{i}\),
$$\begin{aligned} |\beta _{i}|^2\le 2^{n-1}\max \{\lambda _{1}^{2},\lambda _{2}^{2},\ldots ,\lambda _{i}^{2}\}=2^{n-1}\lambda _{i}^{2}. \end{aligned}$$
Therefore, (7.78) holds for all i. The Theorem holds.
Next, we choose the Reduced basis to solve the closest vector problem (CVP). For any given \(t \in \mathbb {R}^{n}\), because a lattice L has only finitely many lattice points in the ball \(\mathrm{Ball}(t,r)\) with center t and radius r, there is a lattice point \(u_{t}\) closest to t, that is
$$\begin{aligned} |u_{t}-t|=\min _{x\in L}|x-t|. \end{aligned}$$
(7.79)
We use the Reduced basis to find a lattice point \(\omega \in L\) such that
$$\begin{aligned} |\omega -t|\le r_{1}(n)|u_{t}-t|, \end{aligned}$$
(7.80)
\(\omega \) is called an approximation of the nearest lattice point \(u_{t}\), and \(r_{1}(n)\) is called an approximation coefficient. According to Babai (1986), to solve the approximation of the nearest lattice point \(u_{t}\), we adopt the following two technical means:
  1. (A)
    rounding off: \(\forall ~x \in \mathbb {R}^{n}\), \([\beta _{1},\beta _{2},\ldots , \beta _{n}]=B\) is a Reduced basis of lattice L. The rounded vector \([x]_{B}\) of x is defined as follows: let
    $$\begin{aligned} x=\sum _{i=1}^{n}x_{i}\beta _{i}, x_{i}\in \mathbb {R}, \end{aligned}$$
     
Let \(\delta _{i}\) be the nearest integer to \(x_{i}\), then define
$$\begin{aligned}{}[x]_{B}=\sum _{i=1}^{n}\delta _{i}\beta _{i}, \end{aligned}$$
(7.81)
\([x]_{B}\) is called the rounded vector of x under the basis B. Write \(x=[x]_{B}+\{x\}_{B}\), then
$$\begin{aligned} \{x\}_{B}\in \left\{ \sum _{i=1}^{n}a_{i}\beta _{i}|-\frac{1}{2}<a_{i}\le \frac{1}{2}, 1\le i\le n \right\} . \end{aligned}$$
  1. (B)

    Adjacent plane

     

Let \(U=\sum _{i=1}^{n-1}\mathbb {R}\beta _{i}\subset \mathbb {R}^{n}\) be the \((n-1)\)-dimensional subspace spanned by \(\beta _{1},\beta _{2},\ldots , \beta _{n-1}\), \(L^{'}=\sum _{i=1}^{n-1}\mathbb {Z}\beta _{i}\subset L\) be a sublattice of L, and \(v \in L\); we call \(U+v\) an affine plane of \(\mathbb {R}^{n}\). When \(x \in \mathbb {R}^{n}\) is given, if the distance between x and \(U+v\) is the smallest, \(U+v\) is called the nearest affine plane of x.

Let \(x^{'}\) be the orthogonal projection of x in the nearest affine plane \(U+v\), let \(y\in L^{'}\) be the vector closest to \(x-v\) in \(L^{'}\), and let \(w=y+v\) be the approximation of the vector closest to x in L.

Let \(L(\beta _{1},\beta _{2},\ldots , \beta _{n})\subset \mathbb {R}^{n}\) be a lattice, \(\{\beta _{1}^{*},\beta _{2}^{*},\ldots , \beta _{n}^{*} \}\) the corresponding orthogonal basis. \(\forall ~x \in \mathbb {R}^{n}\), write \(x=\sum _{i=1}^{n}x_{i}\beta _{i}^{*}\), \(x_{i} \in \mathbb {R}\), and let \(\delta _{i}\) be the nearest integer to \(x_{i}\); according to the nearest plane method, we take (see Lemma 7.43 below)
$$\begin{aligned} {\left\{ \begin{array}{ll} &{}U=L(\beta _{1}^{*},\beta _{2}^{*},\ldots , \beta _{n-1}^{*})=L(\beta _{1},\beta _{2},\ldots , \beta _{n-1})\\ &{}v=\delta _{n}\beta _{n}\in L\\ &{}x^{'}=\sum \limits _{i=1}^{n-1}x_{i}\beta _{i}^{*}+\delta _{n}\beta _{n}^{*}\\ &{}y ~\text {is the lattice point of the sublattice}~L^{'}=\sum \limits _{i=1}^{n-1}\mathbb {Z}\beta _{i}~\text {closest to}~x-v\\ &{}\omega =y+v \end{array}\right. } \end{aligned}$$
(7.82)
We prove that

Theorem 7.7

Let \(L=L(B)\subset \mathbb {R}^{n}\) be a lattice, \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n}]\) is a Reduced base of L, for \(\forall ~ x \in \mathbb {R}^{n}\) given, the adjacent plane method produces a lattice point \(\omega =y+v\) adjacent to x in L (by (7.82)), satisfies
$$\begin{aligned} |w-x|\le 2^{\frac{n}{2}}|u_{x}-x|, \end{aligned}$$
(7.83)
where \(u_{x}\) is given by Eq. (7.79) and further
$$\begin{aligned} |x-\omega |\le 2^{\frac{n}{2}-1}|\beta _{n}^{*}|. \end{aligned}$$
(7.84)

Proof

If \(n=1\), then \(B=\theta \in \mathbb {R}\), \(\theta \ne 0\), and \(L=\{n\theta \,|\, n\in \mathbb {Z}\}\). Let \(x\in \mathbb {R}\), \(x=x_1\theta \); then for any \(n\in \mathbb {Z}\),
$$\begin{aligned} |x-n\theta |=|x_{1}\theta -n\theta |=|x_{1}-n||\theta |\ge |x_{1}-\delta ||\theta |, \end{aligned}$$
where \(\delta \) is the nearest integer to \(x_1\), let \(\omega =\delta \theta \), then
$$\begin{aligned} |x-\omega |=|x_{1}-\delta ||\theta |\le |x-n\theta |, ~\forall ~ n \in \mathbb {Z}. \end{aligned}$$
So \(\omega =\delta \theta \) is the lattice point closest to x in L, so \(\omega =u_{x} \in L\), that is
$$\begin{aligned} |x-\omega |=|u_{x}-x|. \end{aligned}$$
Thus (7.83) holds.
Let \(n\ge 2\), we observe (see (7.82)), \(v=\delta _{n}\beta _{n}\), \(x^{'}=\sum _{i=1}^{n-1}x_{i}\beta _{i}^{*}+\delta _{n}\beta _{n}^{*}\), then
$$\begin{aligned} |x-x^{'}|=|x_{n}-\delta _{n}||\beta _{n}^{*}|\le \frac{1}{2}|\beta _{n}^{*}|, \end{aligned}$$
(7.85)
since the distance between distinct affine planes in \(\{U+z\,|\,z\in L\}\) is at least \(|\beta _{n}^{*}|\), and \(|x-x^{'}|\) is the distance from x to its nearest affine plane, there is
$$\begin{aligned} |x-x^{'}|\le |u_{x}-x|. \end{aligned}$$
(7.86)
Let \(\omega =y+v=y+\delta _{n}\beta _{n} \in L\), we prove that
$$\begin{aligned} |x-\omega |^{2}=|x-x^{'}|^{2}+|x^{'}-\omega |^{2}. \end{aligned}$$
(7.87)
Because \(x-x^{'}=(x_{n}-\delta _{n})\beta _{n}^{*}\) and \(x^{'}-\omega =x^{'}-v-y \in U\), we have \((x-x^{'})\bot (x^{'}-\omega )\). Therefore, by the Pythagorean theorem, (7.87) holds. By induction and (7.85), we have
$$\begin{aligned} |x-\omega |^{2}\le \frac{1}{4}(|\beta _{1}^{*}|^{2}+|\beta _{2}^{*}|^{2}+\cdots +|\beta _{n}^{*}|^{2}). \end{aligned}$$
By (7.71),
$$\begin{aligned} |\beta _{i}^{*}|^{2}\le 2^{n-i}|\beta _{n}^{*}|^{2}. \end{aligned}$$
Thus
$$\begin{aligned} \begin{aligned} |x-\omega |^{2}&\le \frac{1}{4}|\beta _{n}^{*}|^{2}(1+2+2^{2}+\cdots +2^{n-1})\\&=\frac{1}{4}(2^{n}-1)|\beta _{n}^{*}|^{2}\\&\le 2^{n-2}|\beta _{n}^{*}|^{2}. \end{aligned} \end{aligned}$$
There is
$$\begin{aligned} |x-\omega |\le 2^{\frac{n}{2}-1}|\beta _{n}^{*}|, \end{aligned}$$
(7.88)
that is (7.84) holds. To prove (7.83), we have two situations:

Case 1: if \(u_{x} \in U+v\).

In this case, \(u_{x}-v\in L^{'}\), and by the induction hypothesis applied to \(L^{'}\) and \(x^{'}-v\), there is
$$\begin{aligned} |x^{'}-\omega |=|x^{'}-v-y|\le C_{n-1}|x^{'}-u_{x}|\le C_{n-1}|x-u_{x}|, \end{aligned}$$
where \(C_{n}=2^{\frac{n}{2}}\). By (7.87) and (7.86), we have
$$\begin{aligned} |x-\omega |\le (1+C_{n-1}^{2})^{\frac{1}{2}}|x-u_{x}|\le C_{n}|x-u_{x}|. \end{aligned}$$
The proposition holds.
Case 2: If \(u_{x} \notin U+v\), then
$$\begin{aligned} |x-u_{x}|\ge \frac{1}{2}|\beta _{n}^{*}|. \end{aligned}$$
By (7.88), we get
$$\begin{aligned} |x-\omega |\le 2^{\frac{n}{2}}|x-u_{x}|. \end{aligned}$$
Thus, Theorem 7.7 holds.
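Applied recursively, the nearest plane construction (7.82) amounts to the following iteration, sketched here in Python as our own illustrative rendering: working down from \(\beta _{n}\) to \(\beta _{1}\), round the coefficient of the residual target along \(\beta _{i}^{*}\) and subtract the corresponding multiple of \(\beta _{i}\).

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(B):
    """The orthogonal basis B* corresponding to the rows of B."""
    Bs = []
    for b in B:
        v = [Fraction(x) for x in b]
        for w in Bs:
            c = dot(v, w) / dot(w, w)
            v = [vi - c * wi for vi, wi in zip(v, w)]
        Bs.append(v)
    return Bs

def nearest_plane(B, x):
    """A lattice point w of L(B) close to x, satisfying the bound (7.84)."""
    Bs = gram_schmidt(B)
    y = [Fraction(t) for t in x]            # residual target
    w = [Fraction(0)] * len(x)              # accumulated lattice point
    for i in range(len(B) - 1, -1, -1):
        c = round(dot(y, Bs[i]) / dot(Bs[i], Bs[i]))    # delta_i in (7.82)
        y = [yi - c * bi for yi, bi in zip(y, B[i])]
        w = [wi + c * bi for wi, bi in zip(w, B[i])]
    return w

# example with the Reduced basis (2, 0), (1, 2) (our own small instance)
B = [[2, 0], [1, 2]]
x = [Fraction(9, 4), Fraction(1, 3)]
w = nearest_plane(B, x)                     # -> the lattice point (2, 0)
```

Each pass of the loop chooses the nearest affine plane in the current sublattice, exactly as in the proof of Theorem 7.7.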

Comparing Theorems 7.5 and 7.7: when \(x=0\), the approximation coefficient of Theorem 7.5 is \(2^{\frac{n-1}{2}}\); for general \(x\in \mathbb {R}^{n}\), there is an additional factor \(\sqrt{2}\) in the approximation coefficient. Using the rounding off technique, we can give another approximation to the closest vector; the second main result in this section is

Theorem 7.8

Let \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n}]\) be a Reduced basis of L, \(x\in \mathbb {R}^{n}\) given arbitrarily, \(u_{x}\in L\) is the lattice point closest to x, and \([x]_{B}\) is given by Eq. (7.81), then \(\omega =[x]_{B}\in L\), and
$$\begin{aligned} |x-[x]_{B}|\le \left( 1+2n\left( \frac{9}{2}\right) ^{\frac{n}{2}}\right) |x-u_{x}|. \end{aligned}$$
(7.89)

By Theorem 7.8, \([x]_{B}\in L\) is an approximation of the nearest lattice point \(u_{x}\), with approximation coefficient \(\gamma _{1}(n)=1+2n\left( \frac{9}{2}\right) ^{\frac{n}{2}}\); it is a little worse than the approximation coefficient produced by the adjacent plane method, but the approximation vector is simpler to compute. In a lattice cryptosystem, \([x]_{B}\) as input information has higher efficiency. To prove Theorem 7.8, we need the following Lemma.

Lemma 7.37

Let \(B=[\beta _{1},\beta _{2},\ldots , \beta _{n}]\) be a Reduced basis of a lattice in \(\mathbb {R}^{n}\), and let \(\theta _{k}\) denote the angle between the vector \(\beta _{k}\) and the subspace \(U_{k}\), where
$$\begin{aligned} U_{k}=\sum _{i\ne k}\mathbb {R}\beta _{i}. \end{aligned}$$
(7.90)
Then for each k, \(1\le k \le n\), we have
$$\begin{aligned} \sin \theta _{k}\ge \left( \frac{\sqrt{2}}{3}\right) ^{n}. \end{aligned}$$
(7.91)

Proof

For given \(1\le k \le n\) and any \(m \in U_{k}\), we prove
$$\begin{aligned} |\beta _{k}|\le \left( \frac{9}{2}\right) ^{\frac{n}{2}}|m-\beta _{k}|, m\in U_{k}. \end{aligned}$$
(7.92)
Because
$$\begin{aligned} \sin \theta _{k}=\min _{m\in U_{k}}\frac{|m-\beta _{k}|}{|\beta _{k}|}, \end{aligned}$$
so (7.92) implies (7.91), and the Lemma holds. To prove (7.92), let \(\{\beta _{1}^{*},\beta _{2}^{*},\ldots , \beta _{n}^{*} \}\) be the orthogonal basis corresponding to the Reduced basis \(\{\beta _{1},\beta _{2},\ldots , \beta _{n}\}\), then \(m\in U_{k}\) can be expressed as
$$\begin{aligned} m=\sum _{i\ne k}a_{i}\beta _{i}=\sum _{j=1}^{n}b_{j}\beta _{j}^{*},a_{i},b_{j}\in \mathbb {R}. \end{aligned}$$
Write
$$\begin{aligned} m=(a_{1},\ldots ,a_{n}) \left[ \begin{array}{cccc} \beta _{1} \\ \beta _{2} \\ \vdots \\ \beta _{n} \end{array}\right] =(a_{1},\ldots ,a_{n})U \left[ \begin{array}{cccc} \beta _{1}^{*} \\ \beta _{2}^{*} \\ \vdots \\ \beta _{n}^{*} \end{array}\right] . \end{aligned}$$
where \(a_{k}=0\), and U is the transition matrix of the Gram–Schmidt orthogonalization. Then for any \(1\le j\le n\), \(1\le k \le n\), there is
$$\begin{aligned} b_{j}=\sum _{i\ne k}a_{i}u_{ij},\beta _{k}=\sum _{i=1}^{n}u_{ki}\beta _{i}^{*}. \end{aligned}$$
So
$$\begin{aligned} m-\beta _{k}=\sum _{j=1}^{n}\gamma _{j}\beta _{j}^{*}, \text {where}~\gamma _{j}=b_{j}-u_{kj}. \end{aligned}$$
Let \(a_{k}=-1\) (so that \(m-\beta _{k}=\sum _{i=1}^{n}a_{i}\beta _{i}\)), then
$$\begin{aligned} \gamma _{j}=\sum _{i=1}^{n}a_{i}u_{ij}=a_{j}+\sum _{i=j+1}^{n}a_{i}u_{ij}. \end{aligned}$$
(7.93)
Therefore, since \(m-\beta _{k}=\sum _{j=1}^{n}\gamma _{j}\beta _{j}^{*}\) gives \(|m-\beta _{k}|^{2}=\sum _{j=1}^{n}\gamma _{j}^{2}|\beta _{j}^{*}|^{2}\), the square of Eq. (7.92) can be rewritten as
$$\begin{aligned} |\beta _{k}|^{2}=\sum _{j=1}^{k}u_{kj}^{2}|\beta _{j}^{*}|^{2}\le \left( \frac{9}{2}\right) ^{n}\sum _{j=1}^{n}\gamma _{j}^{2}|\beta _{j}^{*}|^{2}. \end{aligned}$$
(7.94)
Let us first prove the following assertion:
$$\begin{aligned} \sum _{j=k}^{n}\gamma _{j}^{2}\ge \left( \frac{2}{3}\right) ^{2(n-k)}. \end{aligned}$$
(7.95)
If the above formula does not hold, i.e.,
$$\begin{aligned} \sum _{j=k}^{n}\gamma _{j}^{2}< \left( \frac{2}{3}\right) ^{2(n-k)}. \end{aligned}$$
Then for all j, \(k\le j \le n\), there is
$$\begin{aligned} \gamma _{j}^{2}<\left( \frac{2}{3}\right) ^{2(n-k)}\Rightarrow |\gamma _{j}|<\left( \frac{2}{3}\right) ^{(n-k)}. \end{aligned}$$
(7.96)
By (7.93),
$$\begin{aligned} {\left\{ \begin{array}{ll} &{}\gamma _{n}=a_{n}\\ &{}\gamma _{n-1}=a_{n-1}+a_{n}u_{nn-1}\\ &{}\gamma _{n-2}=a_{n-2}+a_{n-1}u_{n-1n-2}+a_{n}u_{nn-2}\\ &{}\cdots \\ &{}\gamma _{k}=a_{k}+a_{k+1}u_{k+1k}+\cdots +a_{n}u_{nk} \end{array}\right. } \end{aligned}$$
We can prove
$$\begin{aligned} |a_{j}|<\left( \frac{3}{2}\right) ^{n-j}\cdot \left( \frac{2}{3}\right) ^{n-k}. \end{aligned}$$
(7.97)
When \(j=n\), \(a_{n}=\gamma _{n}\), so (7.96) ensures that (7.97) holds. Using reverse induction on \(j~(k\le j\le n)\), by (7.93),
$$\begin{aligned} \begin{aligned} |a_{j}|&=|\gamma _{j}-\sum _{i=j+1}^{n}a_{i}u_{ij}|\le |\gamma _{j}|+\sum _{i=j+1}^{n}\frac{|a_{i}|}{2}\\&< \left( \frac{2}{3}\right) ^{n-k}+\frac{1}{2}\sum _{i=j+1}^{n}\left( \frac{3}{2}\right) ^{n-i}\left( \frac{2}{3}\right) ^{n-k}\\&=\left( \frac{2}{3}\right) ^{n-k}+\frac{1}{2}\left( \frac{2}{3}\right) ^{n-k}\sum _{i=0}^{n-j-1}\left( \frac{3}{2}\right) ^{i}\\&=\left( \frac{2}{3}\right) ^{n-k}+\left( \frac{2}{3}\right) ^{n-k}\left( \left( \frac{3}{2}\right) ^{n-j}-1\right) \\&=\left( \frac{3}{2}\right) ^{n-j}\left( \frac{2}{3}\right) ^{n-k}. \end{aligned} \end{aligned}$$
Therefore, under the assumption of (7.96), we have (7.97). Take \(j=k\) in (7.97), then \(|a_{k}|<1\), but \(a_{k}=-1\), this contradiction shows that Formula (7.96) does not hold, thus (7.95) holds.
We now prove Formula (7.94) to complete the proof of Lemma. By Lemma 7.33,
$$\begin{aligned} |\beta _{k}^{*}|^{2} \ge 2^{j-k}|\beta _{j}^{*}|^{2},1\le j\le k\le n. \end{aligned}$$
And
$$\begin{aligned} |\beta _{k}^{*}|^{2} \le 2^{j-k}|\beta _{j}^{*}|^{2},1\le k\le j\le n. \end{aligned}$$
Therefore, using \(u_{kk}=1\) and \(u_{kj}^{2}\le \frac{1}{4}\) for \(j<k\), the left-hand side of Eq. (7.94) can be estimated as
$$\begin{aligned} \begin{aligned} \sum _{j=1}^{k}u_{kj}^{2}|\beta _{j}^{*}|^{2}&\le |\beta _{k}^{*}|^{2}\sum _{j=1}^{k}u_{kj}^{2}2^{k-j}\\&\le |\beta _{k}^{*}|^{2}\left( 1+\frac{1}{4}\sum _{j=1}^{k-1}2^{k-j}\right) \\&=\frac{2^{k}+2}{4}|\beta _{k}^{*}|^{2}\\&<2^{k}|\beta _{k}^{*}|^{2}. \end{aligned} \end{aligned}$$
On the other hand, the right-hand side of (7.94) can be estimated as
$$\begin{aligned} \begin{aligned} \sum _{j=1}^{n}\gamma _{j}^{2}|\beta _{j}^{*}|^{2}&\ge \sum _{j=k}^{n}\gamma _{j}^{2}|\beta _{j}^{*}|^{2}\\&\ge \sum _{j=k}^{n}\gamma _{j}^{2}2^{k-j}|\beta _{k}^{*}|^{2}\\&\ge 2^{k-n}|\beta _{k}^{*}|^{2}\sum _{j=k}^{n}\gamma _{j}^{2}\\&\ge 2^{k-n}\left( \frac{2}{3}\right) ^{2(n-k)}|\beta _{k}^{*}|^{2}\\&= \left( \frac{2}{9}\right) ^{n-k}|\beta _{k}^{*}|^{2}. \end{aligned} \end{aligned}$$
Since \(\left( \frac{9}{2}\right) ^{n}\left( \frac{2}{9}\right) ^{n-k}=\left( \frac{9}{2}\right) ^{k}\ge 2^{k}\), the two estimates together yield (7.94).
Thus (7.94) holds, and we complete the proof of Lemma 7.37.
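As a numerical illustration of (7.91) (with a small 2-dimensional Reduced basis of our own choosing, not from the original text), one can compute the sine of the angle between each basis vector and the subspace spanned by the other:

```python
from math import sqrt

def sin_angle(v, u):
    """Sine of the angle between v and the line spanned by u (n = 2)."""
    proj = (v[0] * u[0] + v[1] * u[1]) / (u[0] ** 2 + u[1] ** 2)
    perp = (v[0] - proj * u[0], v[1] - proj * u[1])
    return sqrt(perp[0] ** 2 + perp[1] ** 2) / sqrt(v[0] ** 2 + v[1] ** 2)

# b1, b2 form a Reduced basis: |u_21| = 2/5 <= 1/2, and the second
# inequality |b2|^2 = 4 >= (3/4)|b1|^2 = 15/4 holds
b1, b2 = (2, 1), (0, 2)
bound = (sqrt(2) / 3) ** 2           # (sqrt(2)/3)^n with n = 2
assert sin_angle(b1, b2) >= bound    # theta_1
assert sin_angle(b2, b1) >= bound    # theta_2
```

Here both sines equal \(\frac{2}{\sqrt{5}}\approx 0.894\), far above the worst-case bound \(\frac{2}{9}\approx 0.222\) of Lemma 7.37.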

Now we give the proof of Theorem 7.8:

Proof

(The proof of Theorem 7.8) Let \(B=\{\beta _{1},\beta _{2},\ldots , \beta _{n}\}\) be a Reduced basis of lattice \(L=L(B)\), \(1\le k\le n\) given, and \(U_{k}\) the linear subspace generated by \(B-\{ \beta _{k}\}\); by Lemma 7.37, we have
$$\begin{aligned} |\beta _{k}|\le \left( \frac{9}{2}\right) ^{\frac{n}{2}}|m-\beta _{k}|,\forall ~m\in U_{k}. \end{aligned}$$
(7.98)
Let \(x\in \mathbb {R}^{n}\), \(\omega =[x]_{B}\in L\), then
$$\begin{aligned} x-\omega =x-[x]_{B}=\sum _{i=1}^{n}c_{i}\beta _{i}, ~|c_{i}|\le \frac{1}{2}(1\le i\le n). \end{aligned}$$
Let \(u_{x}\) be the lattice point of L nearest to x, and let
$$\begin{aligned} u_{x}-\omega =\sum _{i=1}^{n}a_{i}\beta _{i},a_{i} \in \mathbb {Z}. \end{aligned}$$
We prove
$$\begin{aligned} |u_{x}-\omega |\le 2n \left( \frac{9}{2}\right) ^{\frac{n}{2}}|u_{x}-x|. \end{aligned}$$
(7.99)
We may assume \(u_{x}\ne \omega \), and suppose
$$\begin{aligned} |a_{k}\beta _{k}|=\max _{1\le j\le n}|a_{j}\beta _{j}|>0. \end{aligned}$$
Obviously,
$$\begin{aligned} |u_{x}-\omega |\le n|a_{k}\beta _{k}|. \end{aligned}$$
(7.100)
On the other hand,
$$\begin{aligned} u_{x}-x=(u_{x}-\omega )+(\omega -x)=\sum _{i=1}^{n}(a_{i}+c_{i})\beta _{i}=(a_{k}+c_{k})(\beta _{k}-m). \end{aligned}$$
where
$$\begin{aligned} m=-\frac{1}{a_{k}+c_{k}}\sum _{j\ne k}(a_{j}+c_{j})\beta _{j}\in U_{k}. \end{aligned}$$
By (7.98), and since \(|a_{k}+c_{k}|\ge |a_{k}|-\frac{1}{2}\ge \frac{1}{2}|a_{k}|\) (as \(a_{k}\) is a nonzero integer),
$$\begin{aligned} |u_{x}-x|=|a_{k}+c_{k}||\beta _{k}-m|\ge \frac{1}{2}\left( \frac{2}{9}\right) ^{\frac{n}{2}}|\beta _{k}||a_{k}|. \end{aligned}$$
There is
$$\begin{aligned} |a_{k}\beta _{k}|\le 2 \left( \frac{9}{2}\right) ^{\frac{n}{2}}|u_{x}-x|. \end{aligned}$$
So
$$\begin{aligned} |u_{x}-\omega |\le 2n \left( \frac{9}{2}\right) ^{\frac{n}{2}}|u_{x}-x|. \end{aligned}$$
That is, (7.99) holds. Finally,
$$\begin{aligned} |x-\omega |\le |x-u_{x}|+|u_{x}-\omega |\le \left( 1+2n \left( \frac{9}{2}\right) ^{\frac{n}{2}}\right) |x-u_{x}|. \end{aligned}$$
We complete the proof of Theorem 7.8.

7.6 GGH/HNF Cryptosystem

Lattice-based cryptosystems are the main research object of postquantum cryptography. Since the first proposals in 1996, the field has a history of only some twenty years. The representative technologies are the Ajtai–Dwork cryptosystem, the GGH cryptosystem, the McEliece–Niederreiter cryptosystem based on algebraic coding theory, and the NTRU cryptosystem. We will introduce them, respectively, below.

The GGH cryptosystem is a cryptosystem based on lattice theory proposed by Goldreich, Goldwasser and Halevi in 1997. It is generally considered a candidate public key cryptosystem to replace RSA in the postquantum era.

Let \(L\subset \mathbb {Z}^{n}\) be an integer lattice, B and R are two generating matrices of L, that is
$$L=L(B)=L(R).$$
Because there is a unique HNF basis in L (see Lemma 3.4), let \(B=\mathrm{HNF}(L)\) be the HNF matrix, with B as the public key and R as the private key. Let \(v\in \mathbb {Z}^{n}\) be an integer point and \(e\in \mathbb {R}^{n}\) an error vector. Let \(\sigma \) be a parameter vector, and take \(e=\sigma \) or \(e=-\sigma \), each chosen with probability \(\frac{1}{2}\).
Encryption: for the encoded plaintext \(v\in \mathbb {Z}^{n}\) and the error vector e randomly selected according to the parameter vector \(\sigma \), the public key B is used for encryption. The encryption function \(f_{B,\sigma }\) is defined as
$$\begin{aligned} f_{B,\sigma }(v,e)=Bv+e=c\in \mathbb {R}^{n}. \end{aligned}$$
(7.101)
Decryption: decrypt the ciphertext c with the private key R. Because \(c\in \mathbb {R}^{n}\) and \(R=[\alpha _{1},\alpha _{2},\ldots ,\alpha _{n}]\), c can be expressed linearly in \(\{\alpha _{1},\alpha _{2},\ldots ,\alpha _{n}\}\),
$$c=\sum _{i=1}^{n}x_{i}\alpha _{i},x_{i}\in \mathbb {R}.$$
Let \(\delta _{i}\) be the nearest integer to \(x_{i}\), define (see (7.81))
$$\begin{aligned}{}[c]_{R}=\sum _{i=1}^{n}\delta _{i}\alpha _{i}\in L. \end{aligned}$$
(7.102)
Define the decryption function as
$$\begin{aligned} {\left\{ \begin{array}{ll} \ f_{B,\sigma }^{-1}(c)=B^{-1}[c]_{R}=v, \\ \ e=c-Bv. \end{array}\right. } \end{aligned}$$
(7.103)
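As a toy illustration of the encryption (7.101) and decryption (7.102)–(7.103) (a miniature example of our own with a hypothetical key pair, far too small to offer any security), the following Python sketch uses exact rational arithmetic; the columns of R form the private basis, \(B=RU\) for a unimodular U, and the error entries are \(\pm \frac{1}{4}\).

```python
from fractions import Fraction

def mat_vec(M, v):
    """Matrix-vector product M v (M given as a list of rows)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def solve2(M, b):
    """Solve M z = b exactly for a 2 x 2 matrix (Cramer's rule)."""
    det = Fraction(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    return [(M[1][1] * b[0] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

R = [[4, 1], [1, 3]]                       # private key (a "good" basis)
B = [[4, 5], [1, 4]]                       # public key: B = R U, U = [[1,1],[0,1]]

v = [2, -1]                                # plaintext
e = [Fraction(1, 4), Fraction(-1, 4)]      # error vector, sigma = 1/4
c = [ci + ei for ci, ei in zip(mat_vec(B, v), e)]       # ciphertext (7.101)

coords = solve2(R, c)                      # coordinates of c w.r.t. R
delta = [round(t) for t in coords]
c_R = mat_vec(R, delta)                    # [c]_R, cf. (7.102)
v_dec = solve2(B, c_R)                     # v = B^{-1}[c]_R, cf. (7.103)
```

Here \(R^{-1}e=(\frac{1}{11},-\frac{5}{44})\) rounds to the zero vector, so decryption recovers the plaintext v exactly.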
In order to verify the correctness of the decryption function \(f_{B,\sigma }^{-1}\), we first prove the following simple Lemma. For any \(x\in \mathbb {R}^{n}\), let \(R=[\alpha _{1},\alpha _{2},\ldots ,\alpha _{n}] \in \mathbb {R}^{n\times n}\) be any basis of \(\mathbb {R}^{n}\). If \(x=(a_{1},a_{2},\ldots , a_{n})\in \mathbb {R}^{n}\) and \(\gamma _{i}\) denotes the integer closest to \(a_{i}\), then define (see (7.7))
$$\begin{aligned}{}[x]=(\gamma _{1},\gamma _{2},\ldots ,\gamma _{n})\in \mathbb {Z}^{n}. \end{aligned}$$
(7.104)
Write \(x=\sum _{i=1}^{n}x_{i}\alpha _{i}\), \(\delta _{i}\) is the nearest integer to \(x_{i}\), then define (see (7.102))
$$\begin{aligned}{}[x]_{R}=\sum _{i=1}^{n}\delta _{i}\alpha _{i}\in L(R). \end{aligned}$$
(7.105)

Lemma 7.38

For \(\forall ~ x\in \mathbb {R}^{n}\), \(R\in \mathbb {R}^{n\times n}\) is a set of bases of \(\mathbb {R}^{n}\), we have
$$[x]_{R}=R[R^{-1}x].$$

Proof

Write
$$x= \left[ \begin{array}{cccc} a_{1} \\ a_{2} \\ \vdots \\ a_{n} \end{array}\right] \in \mathbb {R}^{n}\Rightarrow [x]= \left[ \begin{array}{cccc} \gamma _{1} \\ \gamma _{2} \\ \vdots \\ \gamma _{n} \end{array}\right] \in \mathbb {Z}^{n},|a_{i}-\gamma _{i}|\le \frac{1}{2}.$$
If \(x=\sum _{i=1}^{n}x_{i}\alpha _{i}\) , \(R=[\alpha _{1},\alpha _{2},\ldots ,\alpha _{n}]\), then
$$x=R \left[ \begin{array}{cccc} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right] ,\text {and} ~[x]_{R}=R \left[ \begin{array}{cccc} \delta _{1} \\ \delta _{2} \\ \vdots \\ \delta _{n} \end{array}\right] , ~\delta _{i}~\text {is the nearest integer to}~x_{i}.$$
Thus
$$R^{-1}[x]_{R}= \left[ \begin{array}{cccc} \delta _{1} \\ \delta _{2} \\ \vdots \\ \delta _{n} \end{array}\right] =[R^{-1}x].$$
Lemma 7.38 holds.
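Lemma 7.38 gives a direct recipe for computing \([x]_{R}\): apply \(R^{-1}\), round componentwise, and multiply back by R. A small numerical check (with a \(2\times 2\) basis of our own choosing, not from the original text):

```python
from fractions import Fraction

# a basis R of R^2 (columns are the basis vectors alpha_1, alpha_2)
R = [[2, 1], [1, 3]]
det = Fraction(R[0][0] * R[1][1] - R[0][1] * R[1][0])    # det = 5
Rinv = [[R[1][1] / det, -R[0][1] / det],
        [-R[1][0] / det, R[0][0] / det]]

x = [Fraction(17, 10), Fraction(29, 10)]
coords = [Rinv[0][0] * x[0] + Rinv[0][1] * x[1],         # R^{-1} x
          Rinv[1][0] * x[0] + Rinv[1][1] * x[1]]
delta = [round(t) for t in coords]                        # [R^{-1} x]
x_R = [R[0][0] * delta[0] + R[0][1] * delta[1],           # [x]_R = R [R^{-1} x]
       R[1][0] * delta[0] + R[1][1] * delta[1]]
```

Here \(R^{-1}x=(\frac{11}{25},\frac{41}{50})\) rounds to \((0,1)\), so \([x]_{R}=\alpha _{2}=(1,3)\), a lattice point close to x.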

Theorem 7.9

Let \(L=L(R)=L(B)\subset \mathbb {Z}^{n}\) be an integer lattice, B is the public key, R is the private key, \(v\in \mathbb {Z}^{n}\) is the plaintext, and e is the error vector. Then \([R^{-1}e]\ne 0\) if and only if
$$f_{B,\sigma }^{-1}(c)\ne v.$$

Proof

By the definition, the ciphertext \(c=Bv+e=f_{B,\sigma }(v,e)\), and
$$\begin{aligned} f_{B,\sigma }^{-1}(c)\equiv B^{-1}[c]_{R}=B^{-1}R[R^{-1}c]=T[R^{-1}c]. \end{aligned}$$
(7.106)
where \(T=B^{-1}R\in \mathbb {R}^{n\times n}\) is a unimodular matrix. Indeed, because \(L(B)=L(R)\), we have
$$B=RU,\quad U\in GL_{n}(\mathbb {Z}).$$
So
$$B^{-1}R=U^{-1}R^{-1}R=U^{-1}=T,$$
that is, T is a unimodular matrix. By (7.106),
$$\begin{aligned} \begin{aligned} T[R^{-1}c]&=T[R^{-1}(Bv+e)]\\&=T[R^{-1}Bv+R^{-1}e]\\&=T[T^{-1}v+R^{-1}e]. \end{aligned} \end{aligned}$$
Because T is a unimodular matrix, \(v\in \mathbb {Z}^{n}\), so
$$\begin{aligned}{}[T^{-1}v+R^{-1}e]=T^{-1}v+[R^{-1}e]. \end{aligned}$$
(7.107)
Thus
$$T[R^{-1}c]=v+T[R^{-1}e].$$
That is
$$f_{B,\sigma }^{-1}(c)=v+T[R^{-1}e].$$
Because T is a unimodular matrix, \(T[R^{-1}e]=0\Leftrightarrow [R^{-1}e]=0\), so the Theorem holds.
By Theorem 7.9, the GGH encryption mechanism decrypts correctly exactly when \([R^{-1}e]\) is the zero vector, that is
$$\begin{aligned} f_{B,\sigma }^{-1}(c)=v\Leftrightarrow [R^{-1}e]=0. \end{aligned}$$
(7.108)
Therefore, when the private key R is given, the selection of the error vector e and the parameter vector \(\sigma \) becomes the key to the correctness of the GGH cryptosystem. Notice by (7.106) that if we decrypt with the public key B, then
$$[B^{-1}c]=[B^{-1}(Bv+e)]=[v+B^{-1}e]=v+[B^{-1}e].$$
Therefore, the basic condition for the security and correctness of the GGH cryptosystem is
$$\begin{aligned} {\left\{ \begin{array}{ll} \ [R^{-1}e]=0 \\ \ [B^{-1}e]\ne 0. \end{array}\right. } \end{aligned}$$
(7.109)
Because the public key B we choose is an HNF matrix, \([B^{-1}e]\ne 0\) is easy to satisfy. Let \(B=(b_{ij})_{n\times n}\Rightarrow B^{-1}=(c_{ij})_{n\times n}\), where \(c_{ii}=b_{ii}^{-1}\). Let \(e=(e_{1},e_{2},\ldots ,e_{n})\), where each \(e_{i}\) has the same absolute value \(|e_{i}|=\sigma \), \(\sigma \) being the parameter. Thus \(2|e_{n}|>b_{nn}\Rightarrow [B^{-1}e]\ne 0\). Let us focus on \([R^{-1}e]=0\).
\(\forall ~x=(x_{1},x_{2},\ldots , x_{n})\in \mathbb {R}^{n}\), define the \(L_{1}\) norm \(|x|_{1}\) and \(L_{\infty }\) norm \(|x|_{\infty }\) of x as
$$\begin{aligned} |x|_{\infty }=\max _{1\le i\le n}|x_{i}|,|x|_{1}=\sum _{i=1}^{n}|x_{i}|. \end{aligned}$$
(7.110)

Lemma 7.39

Let \(R\in \mathbb {R}^{n\times n}\) be a reversible square matrix, \(R^{-1}= \left[ \begin{array}{cccc} \alpha _{1} \\ \alpha _{2} \\ \vdots \\ \alpha _{n} \end{array}\right] \), where \(\alpha _{i}\) is the row vector of \(R^{-1}\). \(e=(e_{1},e_{2},\ldots , e_{n})\in \mathbb {R}^{n}\), \(|e_{i}|=\sigma \), \(\forall ~ 1\le i\le n\), let
$$\begin{aligned} \rho =\max _{1\le i\le n}|\alpha _{i}|_{1} \end{aligned}$$
(7.111)
be the maximum of the \(L_{1}\) norm of n row vectors of \(R^{-1}\), then when \(\sigma <\frac{1}{2\rho }\), we have \([R^{-1}e]=0\).

Proof

Suppose \(\alpha _{i}=(c_{i1},c_{i2},\ldots ,c_{in})\); the i-th component of \(R^{-1}e\) can be bounded as
$$\left| \sum _{j=1}^{n}c_{ij}e_{j}\right| \le \sigma \sum _{j=1}^{n}|c_{ij}|=\sigma |\alpha _{i}|_{1}\le \sigma \rho .$$
If \(\sigma <\frac{1}{2\rho }\), then each component of \(R^{-1}e\) is \(<\frac{1}{2}\), so \([R^{-1}e]=0\).
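Lemma 7.39 can be checked numerically. A small sketch with a hypothetical \(2\times 2\) inverse \(R^{-1}\): we compute \(\rho \) as the maximum \(L_{1}\) row norm and verify that every error vector with \(|e_{i}|=\sigma <\frac{1}{2\rho }\) rounds to the zero vector:

```python
from fractions import Fraction
import itertools

# hypothetical inverse of a private basis R
Rinv = [[Fraction(2, 3), Fraction(-1, 3)],
        [Fraction(-1, 3), Fraction(2, 3)]]

rho = max(sum(abs(c) for c in row) for row in Rinv)  # max L1 row norm
sigma = Fraction(2, 5)                               # 0.4 < 1/(2 rho) = 0.5

ok = True
for signs in itertools.product([1, -1], repeat=2):
    e = [s * sigma for s in signs]
    comps = [sum(row[j] * e[j] for j in range(2)) for row in Rinv]
    ok = ok and [round(c) for c in comps] == [0, 0]
print(rho, ok)  # -> 1 True
```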

Lemma 7.40

\(R\in \mathbb {R}^{n\times n}\) , \(R^{-1}= \left[ \begin{array}{cccc} \alpha _{1} \\ \alpha _{2} \\ \vdots \\ \alpha _{n} \end{array}\right] \), let \(\max _{1\le i\le n}|\alpha _{i}|_{\infty }=\frac{\gamma }{\sqrt{n}}\), then the probability of \([R^{-1}e]\ne 0\) is
$$\begin{aligned} P\{[R^{-1}e]\ne 0\}\le 2n\exp \left( -\frac{1}{8\sigma ^{2}\gamma ^{2}} \right) . \end{aligned}$$
(7.112)
where \(\sigma \) is the parameter, error vector \(e=(e_{1},\ldots ,e_{n})\), \(|e_{i}|=\sigma \).

Proof

Let \(R^{-1}=(c_{ij})_{n\times n}\), \(R^{-1}e= \left[ \begin{array}{cccc} a_{1} \\ a_{2} \\ \vdots \\ a_{n} \end{array}\right] \), where \(a_{i}=\sum _{j=1}^{n}c_{ij}e_{j}\).

Because \(|c_{ij}|\le \frac{\gamma }{\sqrt{n}}\), \(|e_{j}|=\sigma \), then \(c_{ij}e_{j}\) is in interval \([-\frac{\gamma \sigma }{\sqrt{n}},\frac{\gamma \sigma }{\sqrt{n}}]\); therefore, by Hoeffding inequality, we have
$$P\left\{ |a_{i}|>\frac{1}{2}\right\} =P\left\{ \left| \sum _{j=1}^{n}c_{ij}e_{j}\right| >\frac{1}{2}\right\} <2\exp \left( -\frac{1}{8\sigma ^{2}\gamma ^{2}}\right) .$$
For \([R^{-1}e]\ne 0\) to hold, at least one of the events \(\{|a_{i}|>\frac{1}{2}\}\) must occur. Thus
$$\begin{aligned} \begin{aligned} P\{[R^{-1}e]\ne 0\}&=P\left\{ \bigcup _{i=1}^{n}\left\{ |a_{i}|>\frac{1}{2}\right\} \right\} \\&\le \sum _{i=1}^{n}P\left\{ |a_{i}|>\frac{1}{2}\right\} \\&<2n\exp \left( -\frac{1}{8\sigma ^{2}\gamma ^{2}}\right) . \end{aligned} \end{aligned}$$
The Lemma holds.

Corollary 7.8

For any given \(\varepsilon >0\), when parameter \(\sigma \) satisfies
$$\begin{aligned} \sigma \le \left( \gamma \sqrt{8\log \frac{2n}{\varepsilon }}\right) ^{-1}\Rightarrow P\{[R^{-1}e]\ne 0\}<\varepsilon . \end{aligned}$$
(7.113)
In order to get a direct impression of Eq. (7.113), let us give an example. Let \(n=120\), \(\varepsilon =10^{-5}\); when the elements of the private key matrix R vary in the interval \([-4,4]\), it can be verified that the maximum \(L_{\infty }\) norm of the row vectors of \(R^{-1}=(c_{ij})_{n\times n}\) is approximately \(\frac{1}{30\sqrt{120}}\), thus \(\gamma =\frac{1}{30}\). By Corollary 7.8, when \(\sigma \le (\frac{1}{30}\sqrt{8\log (240\times 10^{5})})^{-1}\approx \frac{30}{11.7}\approx 2.6\), we have
$$P\{[R^{-1}e]\ne 0\}<10^{-5}.$$
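The numerical bound in this example can be reproduced directly (natural logarithm, as in the Hoeffding bound):

```python
import math

n, eps, gamma = 120, 1e-5, 1 / 30
# Corollary 7.8: sigma <= (gamma * sqrt(8 log(2n/eps)))^{-1}
sigma_max = 1 / (gamma * math.sqrt(8 * math.log(2 * n / eps)))
print(round(sigma_max, 2))  # -> 2.57, i.e. the ~2.6 quoted in the text
```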
It can be seen from the above analysis that the GGH cryptosystem does not effectively solve the selection of the private key R, the public key B, and especially the parameter \(\sigma \) and the error vector e. In 2001, Professor Micciancio of the University of California, San Diego further improved the GGH cryptosystem by using an HNF basis and the nearest plane method. In order to introduce the GGH/HNF cryptosystem, we review several important results from the previous sections.

Lemma 7.41

Let \(L=L(B)\subset \mathbb {R}^{n}\) be a lattice, \(B=[\beta _{1},\beta _{2},\ldots ,\beta _{n}]\) is the generating base, \(B^{*}=[\beta _{1}^{*},\beta _{2}^{*},\ldots ,\beta _{n}^{*}]\) is the corresponding orthogonal basis, \(\lambda _{1}=\lambda (L)\) is the minimum distance of L, then
  1. (i)
    $$\begin{aligned} \lambda _{1}=\lambda (L)\ge \min _{1\le i\le n}|\beta _{i}^{*}|. \end{aligned}$$
    (7.114)
    For \(L=L(B)\), take parameter \(\rho =\rho (B)\) as
    $$\begin{aligned} \rho =\frac{1}{2}\min _{1\le i\le n}|\beta _{i}^{*}|. \end{aligned}$$
    (7.115)
    Then for any \(x\in \mathbb {R}^{n}\), there is at most one lattice point \(\alpha \in L\) such that
    $$\begin{aligned} |x-\alpha |<\rho . \end{aligned}$$
    (7.116)
     
  2. (ii)
    Suppose \(L\subset \mathbb {Z}^{n}\) is an integer lattice, then L has a unique HNF base B, that is \(L=L(B)\), \(B=(b_{ij})_{n\times n}\), satisfies
    $$0\le b_{ij}<b_{ii}~\mathrm{when} ~1\le i<j\le n,\quad b_{ij}=0~\mathrm{when} ~1\le j<i\le n.$$
    That is, B is an upper triangular matrix, and the corresponding orthogonal basis \(B^{*}\) of B is a diagonal matrix, that is
    $$B^{*}=\mathrm{diag}\{b_{11},b_{22},\ldots ,b_{nn}\}.$$
     

Proof

Equation (7.114) is given by Lemma 7.18 and property (ii) is given by Lemma 7.26. We only need to prove that if there is a lattice point \(\alpha \in L\) with \(|x-\alpha |<\rho \), then \(\alpha \) is unique. Let \(\alpha _{1}\in L\), \(\alpha _{2}\in L\), \(\alpha _{1}\ne \alpha _{2}\), and
$$|\alpha _{1}-x|<\rho ,|\alpha _{2}-x|<\rho \Rightarrow |\alpha _{1}-\alpha _{2}|<2\rho =\min _{1\le i\le n}|\beta _{i}^{*}|\le \lambda _{1}.$$
Because \(0\ne \alpha _{1}-\alpha _{2}\in L\), this contradicts the definition of \(\lambda _{1}\). Hence \(\alpha _{1}=\alpha _{2}\).
In the previous section, we introduced Babai’s adjacent plane method (see (7.82)). The distance between two subsets \(A_{1}\) and \(A_{2}\) in \(\mathbb {R}^{n}\) is defined as
$$|A_{1}-A_{2}|=\min \{|x-y||x\in A_{1},y\in A_{2}\}.$$
\(x\in \mathbb {R}^{n}\) is a vector, \(A\subset \mathbb {R}^{n}\) is a subset, the distance between x and A is defined as
$$|x-A|=\min \{|x-y||y\in A\}.$$
Suppose \(L\subset \mathbb {R}^{n}\) is a lattice, \(B=[\beta _{1},\beta _{2},\ldots ,\beta _{n}]\) is a generating base, and \(B^{*}=[\beta _{1}^{*},\beta _{2}^{*},\ldots ,\beta _{n}^{*}]\) is the corresponding orthogonal basis. Define the subspace
$$ {\left\{ \begin{array}{ll} \ U=L(\beta _{1},\beta _{2},\ldots ,\beta _{n-1})=\mathbb {R}^{n-1}, &{} L'=\sum _{i=1}^{n-1}\mathbb {Z}\beta _{i}~\text {is a sub-lattice.} \\ \ A_{v}=U+v, &{} v\in L. \end{array}\right. }$$
\(A_{v}\) is called an affine plane with v as the representative element. Any \(x\in \mathbb {R}^{n}\), let \(A_{v}\) be the affine plane closest to x, that is
$$|x-A_{v}|=\min \{|x-A_{\alpha }||\alpha \in L\}.$$
Let \(x'\) be the orthogonal projection of x on \(A_{v}\). Because \(x'-v\in U=\mathbb {R}^{n-1}\), we may recursively let \(y\in L'\) be the lattice point nearest to \(x'-v\). Then we define the adjacent plane operator \(\tau _{B}\) of x under the base B as
$$\begin{aligned} \tau _{B}(x)=w=y+v\in L. \end{aligned}$$
(7.117)
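The recursion defining \(\tau _{B}\) unrolls into the following sketch (floating point, hypothetical basis; the basis vectors are the rows of `B`, and `nearest_plane` is an illustrative name):

```python
def gram_schmidt(basis):
    out = []
    for b in basis:
        v = list(b)
        for u in out:
            mu = sum(x * y for x, y in zip(b, u)) / sum(y * y for y in u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        out.append(v)
    return out

def nearest_plane(basis, x):
    # Babai's nearest plane tau_B(x): pick the nearest affine plane for the
    # last basis vector, project, and recurse (written here iteratively)
    gs = gram_schmidt(basis)
    w, y = [0.0] * len(x), list(x)
    for b, bs in zip(reversed(basis), reversed(gs)):
        c = round(sum(yi * bi for yi, bi in zip(y, bs)) /
                  sum(bi * bi for bi in bs))
        y = [yi - c * bi for yi, bi in zip(y, b)]
        w = [wi + c * bi for wi, bi in zip(w, b)]
    return w

B = [[5.0, 0.0], [1.0, 4.0]]  # hypothetical basis; rows are beta_1, beta_2
print(nearest_plane(B, [4.8, 4.1]))  # -> [6.0, 4.0]
```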

Lemma 7.42

Under the above definition, if \(v_{1},v_{2}\in L\), and \(A_{v_{1}}\ne A_{v_{2}}\), then
$$\begin{aligned} |A_{v_{1}}-A_{v_{2}}|\ge |\beta _{n}^{*}|. \end{aligned}$$
(7.118)

Proof

\(v_{1},v_{2}\in L\), then it can be given by the linear combination of \(\{\beta _{1}^{*},\beta _{2}^{*},\ldots ,\beta _{n}^{*}\}\), that is
$$ {\left\{ \begin{array}{ll} \ v_{1}=\sum _{i=1}^{n}a_{i}\beta _{i}^{*}, &{} \text {where}~a_{i}\in \mathbb {R},a_{n}\in \mathbb {Z}. \\ \ v_{2}=\sum _{i=1}^{n}b_{i}\beta _{i}^{*}, &{} \text {where}~b_{i}\in \mathbb {R},b_{n}\in \mathbb {Z}. \end{array}\right. } $$
To prove that the n-th components \(a_{n}\) and \(b_{n}\) are integers, write
$$v_{1}=\sum _{i=1}^{n}a_{i}^{*}\beta _{i},v_{2}=\sum _{i=1}^{n}b_{i}^{*}\beta _{i},a_{i}^{*},b_{i}^{*}\in \mathbb {Z}.$$
Therefore,
$$a_{n}=\frac{\langle v_{1},\beta _{n}^{*}\rangle }{|\beta _{n}^{*}|^{2}} =\frac{\langle a_{n}^{*}\beta _{n},\beta _{n}^{*}\rangle }{|\beta _{n}^{*}|^{2}}=a_{n}^{*}\in \mathbb {Z}.$$
The above equation uses Eq. (7.52); \(b_{n}\in \mathbb {Z}\) can be proved in the same way. By the condition \(v_{1}-v_{2}\notin U\), we have \(a_{n}\ne b_{n}\), therefore
$$|A_{v_{1}}-A_{v_{2}}|=|a_{n}-b_{n}||\beta _{n}^{*}|\ge |\beta _{n}^{*}|.$$
We have completed the proof of Lemma.

Lemma 7.43

Under the above definitions and symbols, suppose \(x\in \mathbb {R}^{n}, x=\sum _{i=1}^{n}\gamma _{i}\beta _{i}^{*}\), \(\delta \) is the nearest integer to \(\gamma _{n}\), then
  1. (i)
    $$\begin{aligned} v=\delta \beta _{n}, x'=\sum _{i=1}^{n-1}\gamma _{i}\beta _{i}^{*}+\delta \beta _{n}^{*}. \end{aligned}$$
    (7.119)
    That is, the affine plane closest to x is \(A_{v}\) with \(v=\delta \beta _{n}\), and the orthogonal projection of x on \(A_{v}\) is \(x'\).
     
  2. (ii)
    Let \(u_{x}\in L\) be the lattice point closest to x, then
    $$\begin{aligned} |x-x'|\le |x-u_{x}|. \end{aligned}$$
    (7.120)
     

Proof

Take \(v=\delta \beta _{n}\), then \(v\in L\); we want to prove that the distance between x and \(A_{v}\) is the smallest. Because \(x=\sum _{i=1}^{n}\gamma _{i}\beta _{i}^{*}\), expanding v in the orthogonal basis gives, for some \(\gamma '_{i}\in \mathbb {R}\),
$$x-v=\sum _{i=1}^{n-1}\gamma '_{i}\beta _{i}^{*}+(\gamma _{n}-\delta )\beta _{n}^{*},$$
$$\Longrightarrow |x-A_{v}|=|x-v-U|\le |\gamma _{n}-\delta ||\beta _{n}^{*}|\le \frac{1}{2}|\beta _{n}^{*}|.$$
Let \(v_{1}\in L\) with \(v-v_{1}\notin U\); by Lemma 7.42 and the triangle inequality,
$$|x-A_{v_{1}}|\ge |A_{v_{1}}-A_{v}|-|x-A_{v}|\ge |\beta _{n}^{*}|-\frac{1}{2}|\beta _{n}^{*}|= \frac{1}{2}|\beta _{n}^{*}|\ge |x-A_{v}|.$$
So it is correct to take \(v=\delta \beta _{n}\). Secondly, we prove that the orthogonal projection \(x'\) of x onto the affine plane \(A_{v}\) is
$$x'=\sum _{i=1}^{n-1}\gamma _{i}\beta _{i}^{*}+\delta \beta _{n}^{*}.$$
Let’s first prove \(x'\in A_{v}\). Because \(v=\delta \beta _{n}\), and
$$\begin{aligned} \beta _{n}=\sum _{i=1}^{n-1}c_{i}\beta _{i}^{*}+\beta _{n}^{*}\Rightarrow \delta \beta _{n}=\sum _{i=1}^{n-1}\delta c_{i}\beta _{i}^{*}+\delta \beta _{n}^{*}=v. \end{aligned}$$
(7.121)
Thus
$$x'-v=\sum _{i=1}^{n-1}(\gamma _{i}-\delta c_{i})\beta _{i}^{*}\in U.$$
That is \(x'\in U+v=A_{v}\). And \(x-x'=(\gamma _{n}-\delta )\beta _{n}^{*}\Rightarrow (x-x')\bot U\). Because
$$U\bigcap A_{v}=\varnothing .$$
Then \(A_{v}\) and U are two parallel planes, thus \((x-x')\bot A_{v}\). This proves that the orthogonal projection of x on \(A_{v}\) is \(x'\), and thus (i) holds.
The proof of (ii) is direct. For any \(\alpha \in L\), the distance between x and the affine plane \(A_{\alpha }\) satisfies
$$|x-\alpha |\ge |x-A_{\alpha }|.$$
When \(\alpha =v\), because \((x-x')\bot A_{v}\), thus
$$|x-x'|=|x-A_{v}|\le |x-A_{\alpha }|,\forall ~\alpha \in L.$$
Let \(u_{x}\in L\) be the lattice point closest to x, then take \(\alpha =u_{x}\), there is
$$|x-x'|\le |x-A_{u_{x}}|\le |x-u_{x}|.$$
The Lemma holds.

Lemma 7.44

Let \(L=L(B)\subset \mathbb {R}^{n}\) be a lattice, \(x\in \mathbb {R}^{n},\alpha \in L\). If \(|x-\alpha |<\rho \), where \(\rho =\frac{1}{2}\min \{|\beta _{i}^{*}||1\le i\le n\}\), then the nearest plane operator \(\tau _{B}\) has
$$\begin{aligned} \tau _{B}(x)=\alpha . \end{aligned}$$
(7.122)

Proof

Because of
$$|x-A_{\alpha }|\le |x-\alpha |<\rho .$$
By Lemma 7.42, \(A_{\alpha }\) is the plane \(A_{v}\) closest to x, that is \(A_{\alpha }=A_{v}\). And \(\tau _{B}(x)=w=y+v\), then we have
$$\begin{aligned} |x-w|\le |x-\alpha |<\rho . \end{aligned}$$
(7.123)
By Lemma 7.41(i), we have \(\alpha =w=\tau _{B}(x)\). The Lemma holds.
Now let us introduce the workflow of the GGH/HNF cryptosystem:
  1. 1.
    \(L=L(B)=L(R)\subset \mathbb {Z}^{n}\) is an integer lattice, \(R=[r_{1},r_{2},\ldots , r_{n}]\) is the private key, \(B=[\beta _{1},\beta _{2},\ldots ,\beta _{n}]\) is the public key, and is the HNF basis of L, where
    $$B^{*}=\mathrm{diag}\{b_{11},b_{22},\ldots ,b_{nn}\}.$$
    We choose the private key R as a particularly good basis, with \(\rho =\frac{1}{2}\min \{|r_{i}^{*}||1\le i\le n\}\). In particular, the public key B satisfies
    $$\frac{1}{2}b_{ii}<\rho , \forall ~1\le i\le n.$$
     
  2. 2.

    Let \(v\in \mathbb {Z}^{n}\) be an integer vector (the plaintext), and let \(e\in \mathbb {R}^{n}\) be the error vector, satisfying \(|e|<\rho \).

     
  3. 3.
    Encryption: after any plaintext information \(v\in \mathbb {Z}^{n}\) and error vector e are selected, with \(\rho \) as the parameter, the encryption function \(f_{B,\rho }\) is defined as
    $$f_{B,\rho }(v,e)=Bv+e=c.$$
     
  4. 4.
    Decryption: We decrypt the ciphertext c with the private key R. The decryption transformation is
    $$f_{B,\rho }^{-1}(c)=B^{-1}\tau _{R}(c),$$
    where \(\tau _{R}\) is the nearest plane operator defined by R.
    By Lemma 7.44, when \(|e|<\rho \), \(\Rightarrow |c-Bv|=|e|<\rho \), thus
    $$\begin{aligned} B^{-1}\tau _{R}(c)=B^{-1}\tau _{R}(Bv+e)=B^{-1}Bv=v. \end{aligned}$$
    (7.124)
    This ensures the correctness of decryption.
     

Comparing GGH with GGH/HNF: they choose the same encryption function, but the decryption transformations are very different. GGH adopts Babai's rounding-off method, while GGH/HNF adopts Babai's nearest plane method. There is also a difference in the selection of the error vector e. In GGH, e depends on the parameter \(\sigma \): each component of e must equal \(\pm \sigma \). In GGH/HNF, e depends only on the parameter \(\rho \): any e of length less than \(\rho \) will do. Therefore, GGH/HNF has greater flexibility in the selection of the error vector e.
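The whole GGH/HNF flow can be sketched end to end. A toy 2-dimensional example, assuming a hypothetical good private basis R and the HNF basis B of the same lattice (both stored as column vectors; the names `R_cols`, `B_cols`, `encrypt`, `decrypt` are illustrative). Decryption applies \(\tau _{R}\) and then solves \(Bt=w\) by back substitution:

```python
def gram_schmidt(basis):
    out = []
    for b in basis:
        v = list(b)
        for u in out:
            mu = sum(x * y for x, y in zip(b, u)) / sum(y * y for y in u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        out.append(v)
    return out

def nearest_plane(basis, x):
    # Babai's nearest plane: returns the lattice point tau(x)
    gs = gram_schmidt(basis)
    w, y = [0.0] * len(x), list(x)
    for b, bs in zip(reversed(basis), reversed(gs)):
        c = round(sum(yi * bi for yi, bi in zip(y, bs)) /
                  sum(bi * bi for bi in bs))
        y = [yi - c * bi for yi, bi in zip(y, b)]
        w = [wi + c * bi for wi, bi in zip(w, b)]
    return w

R_cols = [(4.0, 1.0), (1.0, 4.0)]    # private basis (good: nearly orthogonal)
B_cols = [(15.0, 0.0), (4.0, 1.0)]   # HNF basis of the same lattice (public)

def encrypt(v, e):
    # c = B v + e
    return [sum(B_cols[j][i] * v[j] for j in range(2)) + e[i] for i in range(2)]

def decrypt(c):
    w = nearest_plane(R_cols, c)     # tau_R(c) = B v  when |e| < rho
    t1 = w[1] / B_cols[1][1]         # back substitution: B upper triangular
    t0 = (w[0] - B_cols[1][0] * t1) / B_cols[0][0]
    return [round(t0), round(t1)]

c = encrypt([2, -1], [0.9, -0.8])    # |e| ~ 1.2 < rho ~ 1.8
print(decrypt(c))                    # -> [2, -1]
```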

Next, we explain why the public key B is chosen as an HNF basis. For any integer lattice \(L=L(B)\subset \mathbb {Z}^{n}\), \(B^{*}=[\beta _{1}^{*},\beta _{2}^{*},\ldots ,\beta _{n}^{*}]\) is the corresponding orthogonal basis. Using the congruence relation \(\mod L\), we define an equivalence relation in \(\mathbb {R}^{n}\), which is also an equivalence relation between integer points in \(\mathbb {Z}^{n}\). By Lemma 7.24, the quotient group \(\mathbb {Z}^{n}/L\) is a finite group, and \(|\mathbb {Z}^{n}/L|=d(L)\). We further give a set of representative elements of \(\mathbb {Z}^{n}/L\). Let
$$\begin{aligned} F(B^{*})=\left\{ \sum _{i=1}^{n}x_{i}\beta _{i}^{*}|0\le x_{i}<1\right\} \end{aligned}$$
(7.125)
be a parallelepiped; it can be compared with the fundamental domain \(F=F(B)\) of \(\mathbb {R}^{n}/L\) (see Lemma 7.16),
$$F=F(B)=\left\{ \sum _{i=1}^{n}x_{i}\beta _{i}|0\le x_{i}<1\right\} .$$
F is a general parallelepiped.

Lemma 7.45

For any integer point \(\alpha \in \mathbb {Z}^{n}\), there is a unique \(w \in F(B^{*})\) such that
$$\alpha \equiv w({{\,\mathrm{mod}\,}}L).$$

Proof

\(\alpha \in \mathbb {Z}^{n}\) is an integer point, then \(\alpha \) can be expressed as a linear combination of \(B^{*}\); write
$$\alpha =\sum _{i=1}^{n}a_{i}\beta _{i}^{*},a_{i}\in \mathbb {R}.$$
\([a_{i}]\) represents the largest integer not greater than \(a_{i}\). Let
$$\begin{aligned} w=\sum _{i=1}^{n}a_{i}\beta _{i}^{*}-\sum _{i=1}^{n}[a_{i}]\beta _{i}. \end{aligned}$$
(7.126)
Then
$$\alpha -w=\sum _{i=1}^{n}[a_{i}]\beta _{i}\in L\Rightarrow \alpha \equiv w({{\,\mathrm{mod}\,}}L).$$
We prove that \(w\in F(B^{*})\), linearly express w with the basis vector of \(B^{*}\),
$$w=\sum _{i=1}^{n}b_{i}\beta _{i}^{*}.$$
We need only prove that \(0\le b_{i}<1\). By (7.52), it is not difficult to obtain
$$b_{n}=\frac{\langle w,\beta _{n}^{*} \rangle }{|\beta _{n}^{*}|^{2}} =\frac{(a_{n}-[a_{n}])|\beta _{n}^{*}|^{2}}{|\beta _{n}^{*}|^{2}}=a_{n}-[a_{n}].$$
Thus \(0\le b_{n}<1\); it is not difficult to verify by induction that \(0\le b_{i}<1\) for all \(1\le i\le n\), that is \(w\in F(B^{*})\). To prove uniqueness, let
$$w=\sum _{i=1}^{n}a_{i}\beta _{i}^{*},\mathrm{where}~|a_{i}|<1.$$
We prove that if
$$\begin{aligned} w\equiv 0({{\,\mathrm{mod}\,}}L)\Leftrightarrow w=0. \end{aligned}$$
(7.127)
Write \(w=\sum _{i=1}^{n}b_{i}\beta _{i}\), then by (7.52), there is
$$a_{n}=\frac{\langle w,\beta _{n}^{*}\rangle }{\langle \beta _{n}^{*},\beta _{n}^{*}\rangle }=\frac{b_{n}|\beta _{n}^{*}|^{2}}{|\beta _{n}^{*}|^{2}} =b_{n}.$$
Because \(w\equiv 0({{\,\mathrm{mod}\,}}L)\) means \(w\in L\), and \(|b_{n}|<1\Rightarrow b_{n}=0\). It is not difficult to obtain \(b_{1}=b_{2}=\cdots =b_{n}=0\) by induction. That is \(w=0\), so (7.127) holds.
\(\alpha \in \mathbb {Z}^{n}\), if \(w_{1}\in F(B^{*}), w_{2}\in F(B^{*}), \alpha \equiv w_{1}({{\,\mathrm{mod}\,}}L), \alpha \equiv w_{2}({{\,\mathrm{mod}\,}}L)\), then
$$w_{1}-w_{2}\equiv 0({{\,\mathrm{mod}\,}}L).$$
By (7.127), \(w_{1}=w_{2}\). Conversely, if \(w_{1}\in F(B^{*}),w_{2}\in F(B^{*})\) and \(w_{1}\ne w_{2}\), then \(w_{1}\not \equiv w_{2}({{\,\mathrm{mod}\,}}L)\); that is, the points in \(F(B^{*})\) are pairwise incongruent under \({{\,\mathrm{mod}\,}}L\). The Lemma holds.
From the above lemma, any two points in the parallelepiped \(F(B^{*})\) are incongruent \({{\,\mathrm{mod}\,}}L\); therefore, for distinct lattice points \(\alpha _{1},\alpha _{2}\in L\),
$$\{F(B^{*})+\alpha _{1}\}\cap \{F(B^{*})+\alpha _{2}\}=\varnothing .$$
Thus, \(\mathbb {R}^{n}\) can be split into
$$\begin{aligned} \mathbb {R}^{n}=\bigcup _{\alpha \in L}(F(B^{*})+\alpha ). \end{aligned}$$
(7.128)
By Lemma 7.45, any \(\alpha \in \mathbb {Z}^{n}\), there exists a unique \(w\in F(B^{*})\Rightarrow \alpha \equiv w({{\,\mathrm{mod}\,}}L)\), define
$$w=\alpha {{\,\mathrm{mod}\,}}L.$$
Then \(\alpha \rightarrow \alpha {{\,\mathrm{mod}\,}}L\) gives a surjection \(\mathbb {Z}^{n}\rightarrow \mathbb {Z}^{n}\cap F(B^{*})\), and this mapping induces a 1-1 correspondence \(\mathbb {Z}^{n}/L\rightarrow \mathbb {Z}^{n}\cap F(B^{*})\), because if \(\alpha ,\beta \in \mathbb {Z}^{n}\), then
$$ {\left\{ \begin{array}{ll} \alpha \equiv \beta ({{\,\mathrm{mod}\,}}L)\Rightarrow \alpha {{\,\mathrm{mod}\,}}L =\beta {{\,\mathrm{mod}\,}}L \in \mathbb {Z}^{n}\cap F(B^{*}) \\ \alpha \not \equiv \beta ({{\,\mathrm{mod}\,}}L)\Rightarrow \alpha {{\,\mathrm{mod}\,}}L \ne \beta {{\,\mathrm{mod}\,}}L . \end{array}\right. } $$
By Lemma 7.24, we obviously have the following Corollary.

Corollary 7.9

If \(L=L(B)\subset \mathbb {Z}^{n}\) is an integer lattice, then \(F(B^{*})\cap \mathbb {Z}^{n}\) is a representative element set of \(\mathbb {Z}^{n}/L\), and
$$\begin{aligned} |F(B^{*})\cap \mathbb {Z}^{n}|=d(L). \end{aligned}$$
(7.129)
If B is the HNF basis of the integer lattice L, then \(B^{*}=\mathrm{diag}\{b_{11},b_{22},\ldots ,b_{nn}\}\); thus, the fundamental domain \(F(B^{*})\) takes the simplest form:
$$\begin{aligned} F(B^{*})=\{(x_{1},x_{2},\ldots ,x_{n})|0\le x_{i}<b_{ii}\}. \end{aligned}$$
(7.130)
This is a rectangular box with volume d(L). Thus
$$\begin{aligned} \mathbb {Z}^{n}/L=F(B^{*})\cap \mathbb {Z}^{n}=\{(x_{1},x_{2},\ldots ,x_{n})|0\le x_{i}<b_{ii},x_{i}\in \mathbb {Z}\}. \end{aligned}$$
(7.131)
This is another proof of Lemma 7.24.
\(\alpha {{\,\mathrm{mod}\,}}L\) is called the reduction vector of \(\alpha \) modulo L. For any \(\alpha \in \mathbb {Z}^{n}\), the number of bits needed to represent the reduction vector \(\alpha {{\,\mathrm{mod}\,}}L\) is
$$\begin{aligned} \sum _{i=1}^{n}\log b_{ii}=\log \prod _{i=1}^{n}b_{ii}=\log d(L). \end{aligned}$$
(7.132)
To sum up, the fundamental domain of the HNF basis of L has a particularly simple geometry: it is actually a rectangular box, which is very helpful for calculating the reduction vector \(x{{\,\mathrm{mod}\,}}L\) of an integer point \(x\in \mathbb {Z}^{n}\). The reduction vector is of great significance in the further improvement and analysis of the GGH/HNF cryptosystem. For detailed work, please refer to D. Micciancio’s paper (Micciancio, 2001).
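Computing \(x{{\,\mathrm{mod}\,}}L\) with an HNF basis is a simple back-to-front reduction. A sketch over a hypothetical 2-dimensional HNF basis (columns generate L; `hnf_reduce` is an illustrative name, cf. (7.130)):

```python
def hnf_reduce(B, x):
    # reduce the integer point x modulo L(B); B is upper-triangular HNF,
    # its columns generate L; the result lies in the box 0 <= x_i < b_ii
    x = list(x)
    for i in range(len(x) - 1, -1, -1):
        q = x[i] // B[i][i]              # floor division
        for j in range(len(x)):
            x[j] -= q * B[j][i]          # subtract q times column i
    return x

B = [[15, 4],
     [0, 1]]                             # hypothetical HNF basis
print(hnf_reduce(B, [7, 9]))             # -> [1, 0]
```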

7.7 NTRU Cryptosystem

NTRU cryptosystem is a public key cryptosystem proposed in 1996 by the Number Theory Research Unit (NTRU), composed of three number theorists J. Hoffstein, J. Pipher and J. Silverman of Brown University in the USA. Its main features are that key generation is very simple and that the encryption and decryption algorithms are much faster than the commonly used RSA and elliptic curve cryptography. In particular, NTRU can resist quantum computing attacks and is considered a potential public key cryptosystem that could replace RSA in the post-quantum cryptography era.

The essence of NTRU cryptographic design is the generalization of RSA on polynomials, so it is called the cryptosystem based on polynomial rings. However, NTRU can give a completely equivalent form by using the concept of q-ary lattice, so NTRU is also a lattice based cryptosystem. For simplicity, we start with polynomial rings.

Let \(\mathbb {Z}[x]\) be a polynomial ring with integral coefficients and \(N\ge 1\) be a positive integer. We define the polynomial quotient ring R as
$$R=\mathbb {Z}[x]/\langle x^{N}-1\rangle =\{a_{0}+a_{1}x+\cdots +a_{N-1}x^{N-1}|a_{i}\in \mathbb {Z}\}.$$
Any \(F(x)\in R\) can be written as an integer vector,
$$\begin{aligned} F(x)=\sum _{i=0}^{N-1}F_{i}x^{i}=(F_{0},F_{1},\ldots ,F_{N-1})\in \mathbb {Z}^{N}. \end{aligned}$$
(7.133)
In R, we define a new operation \(\otimes \) called the convolution of two polynomials. Let
$$F(x)=\sum _{i=0}^{N-1}F_{i}x^{i},G(x)=\sum _{i=0}^{N-1}G_{i}x^{i}.$$
Define
$$F\otimes G=H(x)=\sum _{i=0}^{N-1}H_{i}x^{i}=(H_{0},H_{1},\ldots ,H_{N-1}).$$
For any \(k,0\le k\le N-1\),
$$\begin{aligned} \begin{aligned} H_{k}&=\sum _{i=0}^{k}F_{i}G_{k-i}+\sum _{i=k+1}^{N-1}F_{i}G_{N+k-i}\\&=\sum _{\begin{array}{c} 0\le i<N\\ 0\le j<N\\ i+j\equiv k({{\,\mathrm{mod}\,}}N) \end{array}}F_{i}G_{j}. \end{aligned} \end{aligned}$$
(7.134)
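Formula (7.134) is ordinary polynomial multiplication with exponents reduced \({{\,\mathrm{mod}\,}}N\). A direct sketch (`conv` is an illustrative name):

```python
def conv(F, G):
    # cyclic convolution (7.134): polynomial product with x^N = 1
    N = len(F)
    H = [0] * N
    for i in range(N):
        for j in range(N):
            H[(i + j) % N] += F[i] * G[j]
    return H

# (1 + x) ⊗ (1 + x^2) in R with N = 3: the term x^3 folds back to 1
print(conv([1, 1, 0], [1, 0, 1]))  # -> [2, 1, 1]
```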

Lemma 7.46

Under the new multiplication \(\otimes \), R is a commutative ring with a unit element.

Proof

By (7.134),
$$F\otimes G=G\otimes F,F\otimes (G+H)=F\otimes G+F\otimes H.$$
So R forms a commutative ring under \(\otimes \).
If \(a\in \mathbb {Z}\) is a constant polynomial in R, then
$$a\otimes F=aF=(aF_{0},aF_{1},\ldots ,aF_{N-1}).$$
Therefore, R has the unit element \(a=1\). The Lemma holds.
Let \(F(x)=(F_{0},F_{1},\ldots ,F_{N-1})\in R\). Define
$$\begin{aligned} \tilde{F}=\frac{1}{N}\sum _{i=0}^{N-1}F_{i},~\text {the arithmetic mean of the coefficients of}~F. \end{aligned}$$
(7.135)
The \(L^{2}\) norm (Euclidean norm) and \(L^{\infty }\) norm of F are defined as
$$\begin{aligned} {\left\{ \begin{array}{ll} \ |F|_{2}=(\sum _{i=0}^{N-1}(F_{i}-\tilde{F})^{2})^{\frac{1}{2}} \\ \ |F|_{\infty }=\max _{0\le i\le N-1}F_{i}-\min _{0\le i\le N-1}F_{i}. \end{array}\right. } \end{aligned}$$
(7.136)

Definition 7.11

Let \(d_{1},d_{2}\) be two positive integers, and \(d_{1}+d_{2}\le N\), define polynomial set \(A(d_{1},d_{2})\) as
$$\begin{aligned} A(d_{1},d_{2})&=\{F\in R|F ~\text {has} ~d_{1}~\text {coefficients of }~1, d_{2} \; \text {coefficients of }-1, \nonumber \\&\qquad \text {other coefficients are}~0\}. \end{aligned}$$
(7.137)

Lemma 7.47

Let \(1\le d< [\frac{N}{2}]\),
  1. (i)
    Suppose \(F\in A(d,d-1)\), then
    $$|F|_{2}=\sqrt{2d-1-\frac{1}{N}}.$$
     
  2. (ii)
    If \(F\in A(d,d)\), then
    $$|F|_{2}=\sqrt{2d}.$$
     

Proof

If \(F\in A(d,d-1)\), by (7.135), then \(\tilde{F}=\frac{1}{N}\), thus
$$\begin{aligned} \begin{aligned} (|F|_{2})^{2}&=\sum _{i=0}^{N-1}\left( F_{i}-\frac{1}{N}\right) ^{2}\\&=\sum _{i=0}^{N-1}\left( F_{i}^{2}-\frac{2}{N}F_{i}+\frac{1}{N^{2}}\right) \\&=2d-1-\frac{2}{N}+\frac{1}{N}=2d-1-\frac{1}{N}, \end{aligned} \end{aligned}$$
so (i) holds. If \(F\in A(d,d)\), then \(\tilde{F}=0\), thus
$$(|F|_{2})^{2}=2d, \Rightarrow |F|_{2}=\sqrt{2d}.$$
The Lemma holds.
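Lemma 7.47(i) can be confirmed numerically for a sample choice of N and d (the values below are hypothetical):

```python
import math
import random

def l2_norm(F):
    # centered L2 norm of (7.136)
    N = len(F)
    mean = sum(F) / N
    return math.sqrt(sum((c - mean) ** 2 for c in F))

N, d = 11, 3
F = [1] * d + [-1] * (d - 1) + [0] * (N - 2 * d + 1)  # an element of A(d, d-1)
random.shuffle(F)  # coefficient positions do not affect the norm

print(abs(l2_norm(F) - math.sqrt(2 * d - 1 - 1 / N)) < 1e-12)  # -> True
```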
The parameters of the NTRU cryptosystem are three positive integers N, q, p, where \(1\le p<q\), and \((p,q)=1\), that is
$$\text {parameter system}=\{(N,q,p)|1\le p<q, \text {and}~(p,q)=1\}.$$
When the parameters (N, q, p) are selected, we discuss the key generation of NTRU.
Key generation. Each NTRU user selects two polynomials \(f\in R,g\in R\), \(\deg f=\deg g=N-1\), as the private key. Writing \(f=(f_{0},f_{1},\ldots , f_{N-1})\), \(g=(g_{0},g_{1}, \ldots , g_{N-1})\) as row vectors, \((f,g)\in \mathbb {Z}^{2N}\subset \mathbb {R}^{2N}\). Here \(f{{\,\mathrm{mod}\,}}q\) is invertible as a polynomial over \(\mathbb {Z}_{q}\) and \(f{{\,\mathrm{mod}\,}}p\) is invertible as a polynomial over \(\mathbb {Z}_{p}\); that is, \(\exists ~ F_{q}\in \mathbb {Z}_{q}[x], F_{p}\in \mathbb {Z}_{p}[x]\) such that
$$\begin{aligned} F_{q}\otimes f\equiv 1({{\,\mathrm{mod}\,}}q), \text {and} ~F_{p}\otimes f\equiv 1({{\,\mathrm{mod}\,}}p). \end{aligned}$$
(7.138)
When the private key (f, g) is selected, the public key h is given by the following formula:
$$\begin{aligned} h\equiv F_{q}\otimes g({{\,\mathrm{mod}\,}}q). \end{aligned}$$
(7.139)
h can be regarded as a polynomial on \(\mathbb {Z}_{q}\). Quotient rings \(\mathbb {Z}_{q}\) and \(\mathbb {Z}_{p}\) are
$$\mathbb {Z}_{q}=\mathbb {Z}/q\mathbb {Z}=\left\{ a\in \mathbb {Z}|-\frac{q}{2}\le a<\frac{q}{2}\right\} . $$
$$\mathbb {Z}_{p}=\mathbb {Z}/p\mathbb {Z}=\left\{ a\in \mathbb {Z}|-\frac{p}{2}\le a<\frac{p}{2}\right\} .$$
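The representative sets for \(\mathbb {Z}_{q}\) and \(\mathbb {Z}_{p}\) above correspond to a "centered lift". A small sketch of reducing an integer (applied coefficient-wise to a polynomial) into \([-\frac{q}{2},\frac{q}{2})\); `center_lift` is an illustrative name:

```python
def center_lift(a, q):
    # representative of a mod q in the interval [-q/2, q/2)
    return (a + q // 2) % q - q // 2

print([center_lift(a, 7) for a in range(7)])  # -> [0, 1, 2, 3, -3, -2, -1]
```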
Encryption transformation. User B wants to use NTRU to send encrypted information to user A. First, the plaintext is encoded as \(m\in R\), that is \(m\in \mathbb {Z}^{N}\); then take its value under \({{\,\mathrm{mod}\,}}p\), that is
$$m\in \mathbb {Z}_{p}^{N}.$$
Then select a polynomial \(\phi \in R, \deg \phi =N-1\), at random, and use the public key h of user A for encryption. The encryption function \(\sigma \) is
$$\begin{aligned} \sigma (m)=c\equiv p\phi \otimes h+m({{\,\mathrm{mod}\,}}q). \end{aligned}$$
(7.140)
c is the ciphertext received by user A; c is a polynomial over \(\mathbb {Z}_q\) and a vector in \(\mathbb {Z}_{q}^{N}\).
Decryption transformation. After receiving the ciphertext c, user A decrypts it with its own private keys f and \(F_{p}\), first calculating
$$\begin{aligned} a\equiv f\otimes c({{\,\mathrm{mod}\,}}q). \end{aligned}$$
(7.141)
a is taken as a polynomial over \(\mathbb {Z}_{q}\); that is, \(a\in \mathbb {Z}_{q}^{N}\) is uniquely determined. Finally, the decryption transform \(\sigma ^{-1}\) is
$$\begin{aligned} \sigma ^{-1}(c)\equiv a\otimes F_{p}({{\,\mathrm{mod}\,}}p). \end{aligned}$$
(7.142)
Why is the decryption transformation correct? If the parameter selection meets
$$\begin{aligned} p\phi \otimes h+m\in \mathbb {Z}_{q}^{N}. \end{aligned}$$
(7.143)
Then
$$\begin{aligned} c=p\phi \otimes h+m. \end{aligned}$$
(7.144)
Similarly, if \(c\otimes f\in \mathbb {Z}_{q}^{N}\), then \(a=f\otimes c\). By (7.142),
$$a\otimes F_{p}=F_{p}\otimes f\otimes c\equiv c=p\phi \otimes h+m({{\,\mathrm{mod}\,}}p).$$
Thus
$$a\otimes F_{p}\equiv m({{\,\mathrm{mod}\,}}p).$$
Because \(m\in \mathbb {Z}_{p}^{N}\), so
$$\sigma ^{-1}(c)\equiv a\otimes F_{p}\equiv m({{\,\mathrm{mod}\,}}p), \Rightarrow \sigma ^{-1}(c)=m.$$
Therefore, the decryption transformation is correct under the conditions of (7.143) and \(c\otimes f\in \mathbb {Z}_{q}^{N}\).
NTRU’s encryption and decryption transformations cannot guarantee correct decryption \(100\%\) of the time, because a is taken out as a polynomial under \({{\,\mathrm{mod}\,}}q\) for the decryption operation (see (7.142)). To satisfy (7.144) and \(c\otimes f\in \mathbb {Z}_{q}^{N}\), the following condition is necessary:
$$\begin{aligned} |f\otimes c|_{\infty }=|f\otimes (p\phi \otimes h+m)|_{\infty }<q. \end{aligned}$$
(7.145)
Therefore, (7.145) holds whenever the following sufficient condition is satisfied:
$$\begin{aligned} |f\otimes m|_{\infty }\le \frac{q}{4}, \text {and}~|p\phi \otimes g|_{\infty }\le \frac{q}{4}. \end{aligned}$$
(7.146)

Lemma 7.48

For any \(\varepsilon >0\), there are constants \(r_{1}>0\) and \(r_{2}>0\), depending only on \(\varepsilon \) and N, such that for randomly selected polynomials \(F, G\in R\), the following holds with probability \(\ge 1-\varepsilon \):
$$P\{r_{1}|F|_{2}|G|_{2}\le |F\otimes G|_{\infty }\le r_{2}|F|_{2}|G|_{2}\}\ge 1-\varepsilon .$$

Proof

See reference Hoffstein et al. (1998) in this chapter.

By Lemma 7.48, to satisfy (7.146), we choose three parameters \(d_{f}, d_{g}\) and d, where
$$\begin{aligned} f\in A(d_{f},d_{f}-1),g\in A(d_{g},d_{g}), \phi \in A(d,d). \end{aligned}$$
(7.147)
By Lemma 7.47, \(|f|_{2}, |g|_{2}\) and \(|\phi |_{2}\) are known, we choose
$$\begin{aligned} |f|_{2}\cdot |m|_{2}\approx \frac{q}{4r_{2}},|\phi |_{2}\cdot |g|_{2}\approx \frac{q}{4pr_{2}}. \end{aligned}$$
(7.148)
Then, Eq. (7.146) can be guaranteed to be true (in the sense of probability), so that the success rate of the decryption algorithm will be greater than \(1-\varepsilon \). Thus, (7.148) becomes the main parameter selection index of NTRU.
Next, we use the concept of a q-ary lattice to give an equivalent description of the above NTRU. We first discuss the cyclic matrix. Let T and \(T_{1}\) be the following two N-order square matrices.
$$\begin{aligned} T=\left( \begin{array}{cccc}0 &{} \cdots &{} 0 &{} 1 \\ &{} &{} &{} 0 \\ &{} I_{N-1} &{} &{} \vdots \\ &{} &{} &{} 0\end{array}\right) ,~~~ T_{1}=\left( \begin{array}{cccc}0 &{} &{} &{} \\ 0 &{} &{}I_{N-1} &{} \\ \vdots &{} &{} &{} \\ 1&{} 0&{}0 &{} 0\end{array}\right) . \end{aligned}$$
Then \(T^{N}=T_{1}^{N}=I_{N}\), \(T_{1}=T'\), and \(T_{1}=T^{-1}\) because T is an orthogonal matrix; hence \(T_{1}=T^{N-1}\). Here \(I_{N}\) is the N-th order identity matrix. Let \(a=(a_{1},a_{2},\ldots ,a_{N})\in \mathbb {R}^{N}\); it is easy to verify
$$\begin{aligned} T\cdot \left[ \begin{array}{cccc} a_{1} \\ a_{2} \\ a_{3} \\ \vdots \\ a_{N} \end{array}\right] = \left[ \begin{array}{cccc} a_{N} \\ a_{1} \\ a_{2} \\ \vdots \\ a_{N-1} \end{array}\right] , ~(a_{1},a_{2},a_{3},\ldots ,a_{N})T_{1}=(a_{N},a_{1},a_{2},\ldots ,a_{N-1}). \end{aligned}$$
(7.149)
In what follows, we assume \(a= \left[ \begin{array}{cccc} a_{1} \\ a_{2} \\ \vdots \\ a_{N} \end{array}\right] \in \mathbb {R}^{N}\) is a column vector. The N-order cyclic matrix \(T^{*}(a)\) generated by a is defined as
$$\begin{aligned} T^{*}(a)=[a,Ta,T^{2}a,\ldots ,T^{N-1}a]. \end{aligned}$$
(7.150)
If \(b=(b_{1},b_{2},\ldots ,b_{N})\in \mathbb {R}^{N}\) is a row vector, we define an N-order matrix
$$\begin{aligned} T_{1}^{*}(b)= \left[ \begin{array}{cccc} b \\ bT_{1} \\ \vdots \\ bT_{1}^{N-1} \end{array}\right] . \end{aligned}$$
(7.151)
In order to distinguish in the mathematical formula, \(T^{*}(a)\) and \(T_{1}^{*}(a)\) are sometimes written as \(T^{*}a\) and \(T_{1}^{*}a\) or \([T^{*}a]\) and \([T_{1}^{*}a]\). Obviously, the transpose of \(T^{*}(a)\) is
$$\begin{aligned} (T^{*}(a))'= \left[ \begin{array}{cccc} a' \\ a'T_{1} \\ \vdots \\ a'T_{1}^{N-1} \end{array}\right] =T_{1}^{*}(a'). \end{aligned}$$
(7.152)
Equation (7.150) expresses the cyclic matrix by its column vectors; in order to obtain a partition by row vectors, for any \(x=(x_{1},\ldots ,x_{N})\in \mathbb {R}^{N}\), we let
$$\overline{x}=(x_{N},x_{N-1},\ldots ,x_{2},x_{1})\Rightarrow \overline{\overline{x}}=x.$$
\(\overline{x}\) is defined similarly for column vectors. So for any column vector \(a\in \mathbb {R}^{N}\), we have
$$\begin{aligned} T^{*}(a)= [a,Ta,T^{2}a,\ldots ,T^{N-1}a ]= \left[ \begin{array}{cccc} \overline{a'}T_{1} \\ \overline{a'}T_{1}^{2} \\ \vdots \\ \overline{a'}T_{1}^{N} \end{array}\right] . \end{aligned}$$
(7.153)
On the right side of (7.153) is a cyclic matrix, which is partitioned by rows. We first prove that the transpose of the cyclic matrix is still a cyclic matrix.

Lemma 7.49

\(\forall ~ a= \left[ \begin{array}{cccc} \alpha _{1} \\ \alpha _{2} \\ \vdots \\ \alpha _{N} \end{array}\right] \in \mathbb {R}^{N}\), then \((T^{*}(a))'=T^{*}(\overline{T^{-1}a})\).

Proof

Let \(\alpha = \left[ \begin{array}{cccc} \alpha _{1} \\ \alpha _{2} \\ \vdots \\ \alpha _{N} \end{array}\right] \in \mathbb {R}^{N}\), by (7.152), \((T^{*}(a))'=T_{1}^{*}(a')\), where \(\alpha '=(\alpha _{1},\ldots ,\alpha _{N})\) is the transpose of \(\alpha \), let
$$\beta =(\alpha _{1},\alpha _{N},\alpha _{N-1},\ldots ,\alpha _{2})=\overline{\alpha '}T_{1}.$$
Easy to verify
$$T_{1}^{*}(\beta )= \left[ \begin{array}{cccc} \beta \\ \beta T_{1} \\ \vdots \\ \beta T_{1}^{N-1} \end{array}\right] =T^{*}(\alpha ).$$
There is
$$T_{1}^{*}(\beta )=(T^{*}(\beta '))'=T^{*}(\alpha ).$$
Because \(\overline{\alpha '}=(\alpha _{N},\alpha _{N-1},\ldots ,\alpha _{2},\alpha _{1})\) and \(\beta =\overline{\alpha '}T_{1}\), we have
$$\beta '=T\overline{\alpha }=\overline{T^{-1}\alpha }.$$
Transposing \(T^{*}(\alpha )=(T^{*}(\beta '))'\) gives
$$(T^{*}(\alpha ))'=T^{*}(\beta ')=T^{*}(\overline{T^{-1}\alpha }).$$
This completes the proof of the Lemma.
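Lemma 7.49 can be spot-checked numerically; in the sketch below (names ours), \(T^{-1}\) shifts a column vector up one position, implemented with np.roll and shift \(-1\):

```python
import numpy as np

def T_star(a):
    """Cyclic matrix T*(a) with columns a, Ta, ..., T^{N-1}a."""
    a = np.asarray(a)
    return np.column_stack([np.roll(a, k) for k in range(len(a))])

a = np.array([1, 2, 3, 4, 5])
lhs = T_star(a).T                      # (T*(a))'
rhs = T_star(np.roll(a, -1)[::-1])     # T*(overline(T^{-1} a)): shift up, then reverse
assert np.array_equal(lhs, rhs)        # Lemma 7.49
```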

Next, we give an equivalent characterization of cyclic matrix.

Lemma 7.50

Let \(A=(a_{ij})_{N\times N}\) and let \(a= \left[ \begin{array}{cccc} a_{11} \\ a_{21} \\ \vdots \\ a_{N1} \end{array}\right] \in \mathbb {R}^{N}\) be the first column of A. Then \(A=T^{*}(a)\) is a cyclic matrix if and only if for all \(1\le k\le N\): whenever \(1+i-j\equiv k({{\,\mathrm{mod}\,}}N)\), we have \(a_{ij}=a_{k1}\).

Proof

If \(A=T^{*}(a)\) is a cyclic matrix, by direct observation we have
$$ {\left\{ \begin{array}{ll} \ a_{11}=a_{22}=\cdots =a_{NN} \\ \ a_{21}=a_{32}=\cdots =a_{N(N-1)} \\ \ \vdots \\ \ a_{(N-1)1}=a_{N2} \end{array}\right. }$$
That is, for \(i\ge j\), \(a_{ij}=a_{k1}\) with \(k=1+i-j\). For \(i<j\), the same pattern gives
$$k=N+1+i-j\Rightarrow 1+i-j\equiv k({{\,\mathrm{mod}\,}}N).$$
The converse follows by reversing the argument, so the Lemma holds.

The following lemma characterizes the main properties of cyclic matrices.

Lemma 7.51

If \(a= \left[ \begin{array}{cccc} a_{1} \\ \vdots \\ a_{N} \end{array}\right] , b= \left[ \begin{array}{cccc} b_{1} \\ \vdots \\ b_{N} \end{array}\right] \) are two column vectors, then
  1. (i)

    \(T^{*}(a)+T^{*}(b)=T^{*}(a+b).\)

     
  2. (ii)

    \(T^{*}(a)\cdot T^{*}(b)=T^{*}([T^{*}a]\cdot b)\), and \(T^{*}(a)T^{*}(b)=T^{*}(b)T^{*}(a).\)

     
  3. (iii)

\(\det (T^{*}(a))=\prod _{k=1}^{N}(a_{1}+a_{2}\xi _{k}+\cdots +a_{N}\xi _{k}^{N-1})\), where \(\xi _{k}~(1 \le k \le N)\) are the N-th roots of unity.

     
  4. (iv)

If the cyclic matrix \(T^{*}(a)\) is reversible, the inverse matrix is \((T^{*}(a))^{-1}=T^{*}(b),\) where b is the first column of \((T^{*}(a))^{-1}\).

     

Proof

(i) is trivial, because
$$T^{*}(a)+T^{*}(b)=[a+b,T(a+b),\ldots ,T^{N-1}(a+b)]=T^{*}(a+b).$$
To prove (ii), using the row vector block of cyclic matrix (see (7.153)), then
$$[T^{*}(a)]b= \left[ \begin{array}{cccc} \overline{a'}T_{1} \\ \overline{a'}T_{1}^{2} \\ \vdots \\ \overline{a'}T_{1}^{N} \end{array}\right] b= \left[ \begin{array}{cccc} \overline{a'}T_{1}b \\ \overline{a'}T_{1}^{2}b \\ \vdots \\ \overline{a'}T_{1}^{N}b \end{array}\right] ,$$
and
$$T^{*}(a)\cdot T^{*}(b)= \left[ \begin{array}{cccc} \overline{a'}T_{1} \\ \overline{a'}T_{1}^{2} \\ \vdots \\ \overline{a'}T_{1}^{N} \end{array}\right] [b,Tb,\ldots ,T^{N-1}b]=(A_{ij})_{N\times N}.$$
where
$$A_{ij}=\overline{a'}T_{1}^{i}\cdot T^{j-1}b=\overline{a'}T_{1}^{N+i-j+1}b=\overline{a'}T_{1}^{i+1-j}b.$$
By Lemma 7.50, then \(T^{*}(a)\cdot T^{*}(b)=T^{*}([T^{*}(a)]b)\), so there is the first conclusion of (ii). We notice that
$$A_{ij}=A_{ij}'=b'T_{1}^{j-1}T^{i}\overline{a}=b'T_{1}^{N-i-1+j}\overline{a}=b'T_{1}^{j-i-1}\overline{a}.$$
It is easy to prove that for any row vector x and column vector y, there is \(x\cdot \overline{y}=\overline{x}\cdot y\), and
$$\begin{aligned} \overline{xT_{1}^{k}}=\overline{x}\cdot T_{1}^{N-k},1\le k\le N. \end{aligned}$$
(7.154)
Thus,
$$A_{ij}=b'T_{1}^{j-i-1}\overline{a}=\overline{b'}T_{1}^{N+i+1-j}a=\overline{b'}T_{1}^{i+1-j}a.$$
This proves that \(T^{*}(a)T^{*}(b)=T^{*}(b)T^{*}(a)\); that is, multiplication of cyclic matrices is commutative.
To prove (iii), let \(A=(T^{*}(a))'\); since \(\det (T^{*}(a))=\det ((T^{*}(a))')\), it suffices to calculate \(\det (A)\). Form the polynomial \(f(x)=a_{1}+a_{2}x+\cdots +a_{N}x^{N-1}\), and let
$$\begin{aligned} V= \left[ \begin{array}{cccccc} 1 &{}1 &{}1 &{}\cdots &{} 1\\ \xi _{1} &{}\xi _{2} &{}\xi _{3} &{}\cdots &{} \xi _{N} \\ \xi _{1}^2 &{}\xi _{2}^2 &{}\xi _{3}^2 &{}\cdots &{} \xi _{N}^2 \\ \cdots &{}\cdots &{}\cdots &{}\cdots &{}\cdots \\ \xi _{1}^{N-1} &{}\xi _{2}^{N-1} &{}\xi _{3}^{N-1} &{}\cdots &{}\xi _{N}^{N-1} \\ \end{array}\right] . \end{aligned}$$
Then
$$\begin{aligned} AV= \left[ \begin{array}{ccccc} f(\xi _{1}) &{} f(\xi _{2}) &{} \cdots &{} f(\xi _{N})\\ \xi _{1}f(\xi _{1}) &{}\xi _{2} f(\xi _{2})&{} \cdots &{} \xi _{N}f(\xi _{N}) \\ \cdots &{}\cdots &{}\cdots &{}\cdots \\ \xi _{1}^{N-1}f(\xi _{1}) &{}\xi _{2}^{N-1} f(\xi _{2}) &{}\cdots &{}\xi _{N}^{N-1}f(\xi _{N}) \\ \end{array}\right] . \end{aligned}$$
So
$$\det (A)\det (V)=\det (AV)=f(\xi _{1}) f(\xi _{2}) \cdots f(\xi _{N})\det (V).$$
Because the \(\xi _{i}\) are distinct, the Vandermonde determinant \(\det (V)\ne 0\), so
$$\begin{aligned} \begin{aligned} \det (A)&=f(\xi _{1}) f(\xi _{2}) \cdots f(\xi _{N})\\&=\prod _{k=1}^{N}f(\xi _{k})\\&=\prod _{k=1}^{N}(a_{1}+a_{2}\xi _{k}+\cdots +a_{N}\xi _{k}^{N-1}). \end{aligned} \end{aligned}$$
Now prove (iv). Let \(e= \left[ \begin{array}{cccc} 1 \\ 0 \\ \vdots \\ 0 \\ \end{array}\right] \in \mathbb {R}^{N}\), then
$$T^{*}(e)=[e,Te,\ldots ,T^{N-1}e]=I_{N}.$$
So take \(b\in \mathbb {R}^{N}\) to satisfy
$$T^{*}(a)\cdot b=e\Rightarrow b=(T^{*}(a))^{-1}e.$$
Obviously, b is the first column of \((T^{*}(a))^{-1}\), by (ii),
$$T^{*}(a)T^{*}(b)=T^{*}([T^{*}(a)]b)=T^{*}(e)=I_{N}.$$
Thus, \((T^{*}(a))^{-1}=T^{*}(b)\). In other words, the inverse of a reversible cyclic matrix is also a cyclic matrix.
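The four statements of Lemma 7.51 can be verified numerically; a sketch follows (names ours; the determinant identity is checked in floating point, hence with a tolerance):

```python
import numpy as np

def T_star(a):
    """Cyclic matrix with columns a, Ta, ..., T^{N-1}a."""
    a = np.asarray(a, dtype=float)
    return np.column_stack([np.roll(a, k) for k in range(len(a))])

a, b = np.array([3.0, 1, 4, 1]), np.array([2.0, 7, 1, 8])
A, B = T_star(a), T_star(b)

assert np.allclose(A + B, T_star(a + b))       # (i)
assert np.allclose(A @ B, T_star(A @ b))       # (ii): the product is cyclic
assert np.allclose(A @ B, B @ A)               # (ii): commutativity

xi = np.exp(2j * np.pi * np.arange(4) / 4)     # the 4th roots of unity
prod = np.prod([np.polyval(a[::-1], x) for x in xi])
assert np.isclose(np.linalg.det(A), prod.real)  # (iii)

assert np.allclose(np.linalg.inv(A), T_star(np.linalg.inv(A)[:, 0]))  # (iv)
```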

Corollary 7.10

Let N be a prime and \(a= \left[ \begin{array}{cccc} a_{1} \\ \vdots \\ a_{N} \\ \end{array}\right] \in \mathbb {R}^{N}\); suppose the components of a are not all equal and \(\sum _{i=1}^{N}a_{i}\ne 0\). Then the cyclic matrix \(T^{*}(a)\) generated by a is a reversible square matrix.

Proof

Under the given conditions, we need only prove \(\det (T^{*}(a))\ne 0\). Let \(\varepsilon _{k}=\exp (\frac{2\pi ik}{N})\), \(1\le k\le N-1\), be the \(N-1\) primitive N-th roots of unity (because N is a prime). If \(\det (T^{*}(a))=0\), then since \(\sum _{i=1}^{N}a_{i}\ne 0\), there must be a \(k,1\le k\le N-1\), such that
$$a_{1}+\varepsilon _{k}a_{2}+\varepsilon _{k}^{2}a_{3}+\cdots +\varepsilon _{k}^{N-1}a_{N}=0.$$
In other words, \(\varepsilon _{k}\) is a root of the polynomial \(\phi (x)=a_{1}+a_{2}x+\cdots +a_{N}x^{N-1}\), so \(\phi (x)\) and \(1+x+\cdots +x^{N-1}\) have the common root \(\varepsilon _{k}\); therefore, the greatest common divisor of the two polynomials is nontrivial:
$$\deg (\phi (x),1+x+\cdots +x^{N-1})\ge 1.$$
Since \(1+x+\cdots +x^{N-1}\) is the N-th cyclotomic polynomial (N prime), it is irreducible, so it divides \(\phi (x)\); comparing degrees, \(\phi (x)\) must be a constant multiple of \(1+x+\cdots +x^{N-1}\), i.e., all components of a are equal, contradicting our assumption. This shows \(\det (T^{*}(a))\ne 0\), and the Corollary holds.
Next, we give an equivalent description of the NTRU lattice by using cyclic matrices. Firstly, we define a linear transformation \(\sigma \) on the even dimensional Euclidean space \(\mathbb {R}^{2N}\): if x and y are two column vectors, define
$$\begin{aligned} \sigma \left[ \begin{array}{cccc} x \\ y \\ \end{array}\right] = \left[ \begin{array}{cccc} Tx \\ Ty \\ \end{array}\right] \in \mathbb {R}^{2N}. \end{aligned}$$
(7.155)
Equivalently, if \(x\in \mathbb {R}^{N}, y\in \mathbb {R}^{N}\) are two row vectors, define
$$\begin{aligned} \sigma (x,y)=(xT_{1},yT_{1})\in \mathbb {R}^{2N}. \end{aligned}$$
(7.156)
Obviously, \(\sigma \) defined above is a linear transformation of \(\mathbb {R}^{2N}\rightarrow \mathbb {R}^{2N}\).

Definition 7.12

An entire lattice \(L\subset \mathbb {R}^{2N}\) is called a convolution q-ary lattice, if
  1. (i)

    L is q-ary lattice, that is \(q\mathbb {Z}^{2N}\subset L\subset \mathbb {Z}^{2N}\).

     
  2. (ii)
    L is closed under the linear transformation \(\sigma \), that is, \(x,y\in \mathbb {R}^{N}\) is the column vector,
    $$ \left[ \begin{array}{cccc} x \\ y \\ \end{array}\right] \in L\Rightarrow \sigma \left[ \begin{array}{cccc} x \\ y \\ \end{array}\right] = \left[ \begin{array}{cccc} Tx \\ Ty \\ \end{array}\right] \in L.$$
     
Recall that NTRU’s private key is a pair of polynomials of degree \(N-1\), \(f=\sum _{i=0}^{N-1}f_{i}x^{i}\) and \(g=\sum _{i=0}^{N-1}g_{i}x^{i}\); write f and g in column vector form:
$$f= \left[ \begin{array}{cccc} f_{0} \\ \vdots \\ f_{N-1} \\ \end{array}\right] \in \mathbb {Z}^{N},f'=(f_{0},f_{1},\ldots ,f_{N-1})\in \mathbb {Z}^{N}.$$
And
$$g= \left[ \begin{array}{cccc} g_{0} \\ \vdots \\ g_{N-1} \\ \end{array}\right] \in \mathbb {Z}^{N},g'=(g_{0},g_{1},\ldots ,g_{N-1})\in \mathbb {Z}^{N}.$$
NTRU’s parameters are N, p, q, where p and q are positive integers, N is a prime and \(1<p<q\); for a positive integer d, define a polynomial set
$$\begin{aligned} A_{d}\{p,0,-p\}&=\{f(x)\in \mathbb {Z}^{N}~|~d+1 ~\text {coefficients of}~f~\text {are}~p, \nonumber \\&\qquad d~\text {coefficients of}~f~\text {are}~-p,~\text {others are}~0\}. \end{aligned}$$
(7.157)
Select two polynomials \(f, g\in \mathbb {Z}^{N}\) of degree \(N-1\) and a positive integer parameter \(d_{f}\) which meet the following restrictions.
  1. (A)

    \(N,p,q,d_{f}\) are positive integers, N is a prime, \(1<p<q,(p,q)=1\);

     
  2. (B)
    f and g are two polynomials of degree \(N-1\), and the constant term of f is 1, and
    $$f-1\in A_{d_{f}}\{p,0,-p\}, g\in A_{d_{f}}\{p,0,-p\}.$$
     
  3. (C)

    \(T^{*}(f)\) is reversible \({{\,\mathrm{mod}\,}}q\).

     
The above (A)–(C) are the parameter constraints of NTRU. Obviously, under these conditions, \(T^{*}(f)\) and \(T^{*}(g)\) are reversible matrices, and
$$\begin{aligned} T^{*}(f)\equiv I_{N}({{\,\mathrm{mod}\,}}p),T^{*}(g)\equiv 0({{\,\mathrm{mod}\,}}p). \end{aligned}$$
(7.158)
After the polynomials f and g satisfying the above conditions are selected as the private key, then \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \in \mathbb {Z}^{2N}\), let’s construct a minimum convolution q-ary lattice containing \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \). Suppose
$$\begin{aligned} A=[T_{1}^{*}(f'),T_{1}^{*}(g')]_{N\times 2N},~ \text {and}~A'= \left[ \begin{array}{cccc} T^{*}(f) \\ T^{*}(g) \\ \end{array}\right] . \end{aligned}$$
(7.159)
Consider A as an \(N\times 2N\)-order matrix on \(\mathbb {Z}_{q}\), that is \(A\in \mathbb {Z}_{q}^{N\times 2N}\), then by (7.45), A defines a 2N dimensional q-ary lattice \(\Lambda _{q}(A)\), that is
$$\begin{aligned} \Lambda _{q}(A)=\{y\in \mathbb {Z}^{2N}~|~\text {there is}~x\in \mathbb {Z}^{N}~\text {such that}~y\equiv A'x({{\,\mathrm{mod}\,}}q)\}. \end{aligned}$$
(7.160)
We prove that \(\Lambda _{q}(A)\) is a convolution q-ary lattice containing \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \). First, we prove the following general identity.

Lemma 7.52

Suppose \(a= \left[ \begin{array}{cccc} a_{1} \\ \vdots \\ a_{N} \\ \end{array}\right] \in \mathbb {R}^{N}\), then for \(\forall ~ x\in \mathbb {R}^{N}\) and \(0\le k\le N-1\), we have
$$T^{k}(T^{*}(a)x)=T^{*}(a)(T^{k}x), \mathrm{where} ~T^{0}=I_{N}.$$

Proof

The case \(k=0\) is trivial, and by induction it suffices to prove the case \(k=1\), that is
$$\begin{aligned} T(T^{*}(a)x)=T^{*}(a)(Tx). \end{aligned}$$
(7.161)
By (7.153),
$$T(T^{*}(a)x)=T \left[ \begin{array}{cccc} \overline{a'}T_{1}x \\ \overline{a'}T_{1}^{2}x \\ \vdots \\ \overline{a'}T_{1}^{N}x \\ \end{array}\right] = \left[ \begin{array}{cccc} \overline{a'}x \\ \overline{a'}T_{1}x \\ \vdots \\ \overline{a'}T_{1}^{N-1}x \\ \end{array}\right] .$$
Because of \(T=T_{1}^{N-1}\), then the right side of Eq. (7.161) is
$$T^{*}(a)(Tx)= \left[ \begin{array}{cccc} \overline{a'}T_{1}Tx \\ \overline{a'}T_{1}^{2}Tx \\ \vdots \\ \overline{a'}T_{1}^{N}Tx \\ \end{array}\right] = \left[ \begin{array}{cccc} \overline{a'}x \\ \overline{a'}T_{1}x \\ \vdots \\ \overline{a'}T_{1}^{N-1}x \\ \end{array}\right] .$$
So (7.161) holds, the Lemma holds.
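A quick numerical check of Lemma 7.52 (names ours; the action of \(T^{k}\) on a column vector is np.roll with shift k):

```python
import numpy as np

def T_star(a):
    """Cyclic matrix with columns a, Ta, ..., T^{N-1}a."""
    a = np.asarray(a)
    return np.column_stack([np.roll(a, k) for k in range(len(a))])

a = np.array([2, 0, 5, 1, 3])
x = np.array([1, -1, 0, 2, 4])
for k in range(5):
    # T^k (T*(a) x) == T*(a) (T^k x)
    assert np.array_equal(np.roll(T_star(a) @ x, k), T_star(a) @ np.roll(x, k))
```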

Lemma 7.53

\(\Lambda _{q}(A)\) is a convolution q-ary lattice, and \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \in \Lambda _{q}(A)\).

Proof

By Lemma 7.27, \(\Lambda _{q}(A)\) is a q-ary lattice, that is \(q\mathbb {Z}^{2N}\subset \Lambda _{q}(A)\subset \mathbb {Z}^{2N}\); we need only prove that \(\Lambda _{q}(A)\) is closed under the linear transformation \(\sigma \). If \(y\in \Lambda _{q}(A)\), then there is \(x\in \mathbb {Z}^{N}\) with \(y\equiv A'x({{\,\mathrm{mod}\,}}q)\); by the definition of \(\sigma \) and Lemma 7.52,
$$\sigma (y)\equiv \left[ \begin{array}{cccc} T(T^{*}(f)x) \\ T(T^{*}(g)x) \\ \end{array}\right] = \left[ \begin{array}{cccc} T^{*}(f)Tx \\ T^{*}(g)Tx \\ \end{array}\right] \equiv A'Tx({{\,\mathrm{mod}\,}}q).$$
Because \(x\in \mathbb {Z}^{N}\Rightarrow Tx\in \mathbb {Z}^{N}\), we get \(\sigma (y)\in \Lambda _{q}(A)\); that is, \(\Lambda _{q}(A)\) is a convolution q-ary lattice. It remains to prove \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \in \Lambda _{q}(A)\). Let \(e= \left[ \begin{array}{cccc} 1 \\ 0 \\ \vdots \\ 0 \\ \end{array}\right] \in \mathbb {Z}^{N}\), then \(T^{*}(f)\cdot e\) is the first column of \(T^{*}(f)\), that is
$$T^{*}(f)e=f, T^{*}(g)e=g.$$
Thus,
$$A'e= \left[ \begin{array}{cccc} T^{*}(f)e \\ T^{*}(g)e \\ \end{array}\right] = \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \in \Lambda _{q}(A).$$
The Lemma holds.

With the above preparation, we now introduce the equivalent form of NTRU in lattice theory.

Public key generation. After the private key \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \in \mathbb {Z}^{2N}\) has been selected, NTRU’s public key is generated as follows. Because the convolution q-ary lattice \(\Lambda _{q}(A)\) containing \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \) is an entire lattice, \(\Lambda _{q}(A)\) has a unique HNF basis H, where
$$\begin{aligned} H= \left[ \begin{array}{cc} I_{N}&{} T^{*}(h) \\ 0 &{} qI_{N} \end{array}\right] , h\equiv [T^{*}(f)]^{-1}g ({{\,\mathrm{mod}\,}}q). \end{aligned}$$
(7.162)
By (7.48) of Lemma 7.28, the determinant \(d(\Lambda _{q}(A))\) of \(\Lambda _{q}(A)\) is
$$d(\Lambda _{q}(A))=|\det (\Lambda _{q}(A))|=q^{2N-N}=q^{N}.$$
So the diagonal blocks of H are \(I_{N}\) and \(qI_{N}\). By assumption, \(T^{*}(f)\in \mathbb {Z}^{N\times N}\) is reversible \({{\,\mathrm{mod}\,}}q\), and \([T^{*}(f)]^{-1}\) denotes the inverse matrix of \(T^{*}(f) ~{{\,\mathrm{mod}\,}}q\). The components \(h_{i}\) of \(h\in \mathbb {Z}^{N}\) are selected between \(-\frac{q}{2}\) and \(\frac{q}{2}\), that is \(-\frac{q}{2}\le h_{i}<\frac{q}{2}\); such an h exists and is unique. It is not difficult to verify that H is an HNF matrix and the lattice generated by H is \(\Lambda _{q}(A)\), so H is the HNF basis of \(\Lambda _{q}(A)\). H is published as the public key.
Encryption transformation. The message sender encodes the plaintext as \(m\in \mathbb {Z}^{N}\) and randomly selects a vector \(r\in \mathbb {Z}^{N}\) satisfying
$$\begin{aligned} m\in A_{d_{f}}\{1,0,-1\}, ~r\in A_{d_{f}}\{1,0,-1\}. \end{aligned}$$
(7.163)
That is, m has \(d_{f}+1\) components equal to 1, \(d_{f}\) components equal to \(-1\), and all other components 0. Then, the plaintext m is encrypted with the public key H of the message recipient:
$$\begin{aligned} c=H \left[ \begin{array}{cccc} m \\ r \\ \end{array}\right] \equiv \left[ \begin{array}{cccc} m+[T^{*}(h)]r \\ 0 \\ \end{array}\right] ({{\,\mathrm{mod}\,}}q). \end{aligned}$$
(7.164)
c is called the ciphertext; its first N components are \(m+[T^{*}(h)]r\), its last N components are 0.
Decryption transformation. If all components of \(m+[T^{*}(h)]r\) lie in the interval \([-\frac{q}{2},\frac{q}{2})\), the message receiver can determine that the ciphertext c is
$$c= \left[ \begin{array}{cccc} m+[T^{*}(h)]r \\ 0 \\ \end{array}\right] .$$
Then decrypt it with its own private key \(T^{*}(f)\),
$$\begin{aligned} \begin{aligned}{}[T^{*}(f)]c&\equiv [T^{*}(f)]m+[T^{*}(f)][T^{*}(h)]r({{\,\mathrm{mod}\,}}q)\\&\equiv [T^{*}(f)]m+[T^{*}(g)]r({{\,\mathrm{mod}\,}}q). \end{aligned} \end{aligned}$$
(7.165)
By the definition of h, there is
$$[T^{*}(f)]h\equiv g({{\,\mathrm{mod}\,}}q)\Rightarrow T^{*}([T^{*}(f)]h)\equiv T^{*}(g)({{\,\mathrm{mod}\,}}q).$$
And by Lemma 7.51, there is \(T^{*}([T^{*}(f)]h)\equiv T^{*}(f)\cdot T^{*}(h)\), so
$$T^{*}(f)T^{*}(h)\equiv T^{*}(g)({{\,\mathrm{mod}\,}}q).$$
Equation (7.165) holds.
If
$$\begin{aligned}{}[T^{*}(f)]m+[T^{*}(g)]r\in \left[ -\frac{q}{2},\frac{q}{2} \right] ^{N}. \end{aligned}$$
(7.166)
then performing the \({{\,\mathrm{mod}\,}}p\) operation on \([T^{*}(f)]m+[T^{*}(g)]r\) and using (7.158), we get
$$\begin{aligned} ([T^{*}(f)]m+[T^{*}(g)]r){{\,\mathrm{mod}\,}}p=I_{N}m+0\cdot r=m. \end{aligned}$$
(7.167)
The correctness of decryption transformation is guaranteed.
In order to ensure that (7.167) holds, it can be seen from the above analysis that the following conditions are necessary.
$$\begin{aligned} {\left\{ \begin{array}{ll} m+[T^{*}(h)]r\in [-{\frac{q}{2}},{\frac{q}{2}}]^{N} \\ {[}T^{*}(f)]m+[T^{*}(g)]r\in [-{\frac{q}{2}},{\frac{q}{2}}]^{N}. \end{array}\right. } \end{aligned}$$
(7.168)
Obviously, the first condition can be derived from the second; that is, (7.168) can be derived from (7.166). We first prove the following Lemma.

Lemma 7.54

If the parameter meets \(d_{f}<\frac{(\frac{q}{4}-1)}{2p}\), then
$$[T^{*}(f)]m+[T^{*}(g)]r\in \left[ -\frac{q}{2},\frac{q}{2} \right] ^{N}.$$

Proof

Because all components of m and r are \(\pm 1\) or 0, it suffices to prove that the sum of the absolute values of the entries in each row of \([T^{*}(f)]\), and likewise of \([T^{*}(g)]\), is less than \(\frac{q}{4}\); then each component of \([T^{*}(f)]m+[T^{*}(g)]r\) has absolute value less than \(\frac{q}{2}\). Write \(f'=(f_{0},f_{1},\ldots ,f_{N-1})\); because \(f_{0}=1\),
$$\left| \sum _{i=0}^{N-1}f_{i}\right| \le \sum _{i=0}^{N-1}|f_{i}|=1+(2d_{f}+1)p<\frac{q}{4}.$$
Similarly,
$$\left| \sum _{i=0}^{N-1}g_{i}\right| \le \sum _{i=0}^{N-1}|g_{i}|=(2d_{f}+1)p<\frac{q}{4}.$$
Thus
$$[T^{*}(f)]m+[T^{*}(g)]r\in \left[ -\frac{q}{2},\frac{q}{2}\right] ^{N}.$$
The Lemma holds.
According to the above lemma, the NTRU algorithm needs the following additional condition to ensure the correctness of the decryption transformation:
  1. (D)
    $$d_{f}<\frac{(\frac{q}{4}-1)}{2p}.$$
     

To sum up, when the NTRU cryptosystem satisfies restrictions (A)–(D) on the parameter system, the private key is \( \left[ \begin{array}{cccc} f \\ g \\ \end{array}\right] \) and the public key is the HNF matrix H; encryption and decryption proceed by the algorithms introduced above.
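The whole scheme can be illustrated end to end with toy parameters. The sketch below is ours, not from the text: the concrete choices \(N=7\), \(p=3\), \(q=41\), \(d_{f}=1\), and the vectors f, g, m, r are our own illustrative picks. We take q prime so that the inverse of \(T^{*}(f)\bmod q\) can be computed by Gauss-Jordan elimination, although the text only requires \(T^{*}(f)\) to be reversible \(\bmod ~q\):

```python
import numpy as np

N, p, q, df = 7, 3, 41, 1

def T_star(a):
    """Cyclic matrix with columns a, Ta, ..., T^{N-1}a."""
    a = np.asarray(a)
    return np.column_stack([np.roll(a, k) for k in range(len(a))])

def inv_mod(M, q):
    """Inverse of an integer matrix mod a prime q, by Gauss-Jordan elimination."""
    n = len(M)
    A = np.hstack([np.asarray(M) % q, np.eye(n, dtype=int)])
    for i in range(n):
        piv = next(r for r in range(i, n) if A[r, i])   # nonzero pivot
        A[[i, piv]] = A[[piv, i]]
        A[i] = A[i] * pow(int(A[i, i]), -1, q) % q      # scale pivot row to 1
        for r in range(n):
            if r != i:
                A[r] = (A[r] - A[r, i] * A[i]) % q      # eliminate column i
    return A[:, n:]

def center(v, mod):
    """Centered residues in [-mod/2, mod/2)."""
    return (np.asarray(v) + mod // 2) % mod - mod // 2

# Private key, condition (B): f - 1 and g have df+1 coefficients p, df coefficients -p.
f = np.array([1, p, 0, p, 0, -p, 0])
g = np.array([p, 0, -p, 0, p, 0, 0])

# Public key (7.162): h = [T*(f)]^{-1} g (mod q), components centered.
h = center(inv_mod(T_star(f), q) @ g, q)

# Encryption (7.164): m, r in A_df{1, 0, -1}.
m = np.array([1, 1, -1, 0, 0, 0, 0])
r = np.array([0, 1, 0, -1, 0, 0, 1])
c1 = center(m + T_star(h) @ r, q)      # first N components of the ciphertext

# Decryption (7.165)-(7.167): multiply by T*(f) mod q, then reduce mod p.
v = center(T_star(f) @ c1, q)          # equals T*(f)m + T*(g)r exactly here
assert np.array_equal(center(v, p), m)
```

With these parameters \(\sum |f_{i}|+\sum |g_{i}|=19<q/2\), so the centered representative recovered in decryption equals \(T^{*}(f)m+T^{*}(g)r\) exactly, and reducing \(\bmod ~p\) returns m, as in (7.167).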

7.8 McEliece/Niederreiter Cryptosystem

The McEliece/Niederreiter cryptosystem is a cryptosystem designed on the asymmetry between encoding and decoding of a special class of linear codes (Goppa codes) over a finite field. It was proposed by McEliece in 1978 and by Niederreiter in 1985, and belongs to the category of post-quantum cryptography. We start with cyclic codes. Recall the concept of a linear code in Chap. 2: let \(\mathbb {F}_{q}\) be a q-element finite field, also known as the alphabet, whose elements are called letters or characters. The N-dimensional linear space \(\mathbb {F}_{q}^{N}\) over \(\mathbb {F}_{q}\) is called the codeword space of length N. Any vector \(a=(a_{0},a_{1},\ldots ,a_{N-1})\in \mathbb {F}_{q}^{N}\) is called a codeword of length N, usually written as \(a=a_{0}a_{1}\cdots a_{N-1}\in \mathbb {F}_{q}^{N}\). From the previous section, we have
$$\begin{aligned} aT_{1}=(a_{0},a_{1},\ldots ,a_{N-1})T_{1}=(a_{N-1},a_{0},a_{1},\ldots ,a_{N-2}). \end{aligned}$$
(7.169)
The reverse codeword \(\overline{a}\) of a codeword \(a=a_{0}a_{1}\cdots a_{N-1}\) is defined as
$$\begin{aligned} \overline{a}=a_{N-1}a_{N-2}\cdots a_{1}a_{0}\in \mathbb {F}_{q}^{N}. \end{aligned}$$
(7.170)
If \(C\subset \mathbb {F}_{q}^{N}\) is a k-dimensional linear subspace of \(\mathbb {F}_{q}^{N}\), then C is called a linear code, usually written as \(C=[N,k]\). For \(k=0\) or \(k=N\), the codes [N, 0] and [N, N] are called trivial codes; actually,
$$[N,0]=\{0=00\cdots 0\},[N,N]=\mathbb {F}_{q}^{N}.$$
The reverse order code \(\overline{C}\) of code C is defined as \(\overline{C}=\{\overline{c}|c\in C\}\), obviously, if \(C=[N,k]\), then \(\overline{C}=[N,k]\).

Definition 7.13

A linear code C of length N is called a cyclic code, if \(\forall ~c\in C\Rightarrow cT_{1}\in C\).

Next, we give an algebraic expression of cyclic codes using ideal theory. For this purpose, note that \(\mathbb {F}_{q}[x]\) is the univariate polynomial ring over \(\mathbb {F}_{q}\), and \(\langle x^{N}-1\rangle \) is the principal ideal generated by the polynomial \(x^{N}-1\). Write \(R=\mathbb {F}_{q}[x]/\langle x^{N}-1 \rangle \) for the quotient ring. If \(a=a_{0}a_{1}\cdots a_{N-1}\in \mathbb {F}_{q}^{N}\), then \(a(x)=a_{0}+a_{1}x+\cdots +a_{N-1}x^{N-1}\in R\), so \(a\rightarrow a(x)\) is a 1-1 correspondence \(\mathbb {F}_{q}^{N}\rightarrow R\) and an isomorphism of additive groups. Under this correspondence, we identify the codeword a with the polynomial a(x). That is \(a=a(x)\Rightarrow \mathbb {F}_{q}^{N}=R=\mathbb {F}_{q}[x]/ \langle x^{N}-1\rangle \), and any code \(C\subset \mathbb {F}_{q}^{N}\) corresponds to
$$C=C(x)=\{c(x)|c\in C\}\subset R.$$
That is, a code C is equivalent to a subset of \(\mathbb {F}_{q}[x]/\langle x^{N}-1 \rangle \). The following lemma reveals the algebraic meaning of a cyclic code.

Lemma 7.55

\(C\subset \mathbb {F}_{q}^{N}\) is a cyclic code \(\Leftrightarrow C(x)\) is an ideal in \(\mathbb {F}_{q}[x]/\langle x^{N}-1\rangle \).

Proof

If C(x) is an ideal of \(\mathbb {F}_{q}[x]/\langle x^{N}-1\rangle \), obviously C is a linear code; for any codeword \(c=c_{0}c_{1}\cdots c_{N-1}\in C\), we have \(c(x)=c_{0}+c_{1}x+\cdots +c_{N-1}x^{N-1}\in C(x)\), thus \(xc(x)=c_{N-1}+c_{0}x+c_{1}x^{2}+\cdots +c_{N-2}x^{N-1}\in C(x)\). So \(cT_{1}=c_{N-1}c_{0}c_{1}\cdots c_{N-2}\in C\), and C is a cyclic code over \(\mathbb {F}_{q}\). Conversely, if C is a cyclic code, then \(cT_{1}\in C\), thus \(cT_{1}^{k}\in C\) for all \(0\le k\le N-1\), where \(T_{1}^{0}=I_{N}\) is the N-th order identity matrix. The polynomial corresponding to \(cT_{1}^{k}\) is
$$\begin{aligned} cT_{1}^{k}(x)=x^{k}c(x). \end{aligned}$$
(7.171)
So for any \(g(x)\in R\), by linearity and (7.171), \(g(x)c(x)\in C(x)\). This proves that C(x) is an ideal. The Lemma holds.
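The correspondence (7.171) between the cyclic shift \(cT_{1}\) and multiplication by x modulo \(x^{N}-1\) is a one-line computation; a small sketch (names ours):

```python
def shift(c):
    """cT1: cyclic right shift of a codeword (list of coefficients)."""
    return c[-1:] + c[:-1]

def mul_x(c, q):
    """Coefficients of x * c(x) mod (x^N - 1), over F_q."""
    return [c[-1] % q] + [ci % q for ci in c[:-1]]

c = [1, 0, 1, 1, 0, 0, 0]           # c(x) = 1 + x^2 + x^3 over F_2
assert shift(c) == mul_x(c, 2)      # Eq. (7.171) with k = 1
```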
Using the homomorphism theorem of rings, we give the mathematical expression of all ideals in R. Let \(\pi \) be the natural homomorphism \(\mathbb {F}_{q}[x]{\mathop {\longrightarrow }\limits ^{\pi }}\mathbb {F}_{q}[x]/ \langle x^{N}-1 \rangle \); then the ideals of R correspond one-to-one to the ideals of \(\mathbb {F}_{q}[x]\) containing \(\mathrm{ker}\pi =\langle x^{N}-1 \rangle \), that is
$$\mathrm{ker}\pi =\langle x^{N}-1 \rangle \subset A \subset \mathbb {F}_{q}[x]{\mathop {\longrightarrow }\limits ^{\pi }}\mathbb {F}_{q}[x]/\langle x^{N}-1\rangle =R.$$
Since \(\mathbb {F}_{q}[x]\) is the principal ideal ring and A is an ideal of \(\mathbb {F}_{q}[x]\), and \( \langle x^{N}-1 \rangle \subset A\), then
$$\begin{aligned} A=\langle g(x)\rangle , \mathrm{where}~g(x)|x^{N}-1. \end{aligned}$$
(7.172)
Therefore, all ideals in R are principal ideals, and there are finitely many of them, listed as follows
$$\{\langle g(x)\rangle {{\,\mathrm{mod}\,}}x^{N}-1~|~ g(x)~\mathrm{divides}~x^{N}-1\},$$
where \(\langle g(x)\rangle {{\,\mathrm{mod}\,}}x^{N}-1\) represents the principal ideal generated by g(x) in R, that is
$$\begin{aligned} \langle g(x)\rangle {{\,\mathrm{mod}\,}}x^{N}-1=\{g(x)f(x)|0\le \deg f(x)\le N-\deg (g(x))-1\}. \end{aligned}$$
(7.173)
This proves that \(\mathbb {F}_{q}[x]/\langle x^{N}-1\rangle \) is a principal ideal ring, and the number of its ideals is \(d+1\), where d is the number of positive factors of \(x^{N}-1\). Here a positive factor means a monic divisor (leading coefficient 1) other than 1. Therefore, we have the following Corollary:

Corollary 7.11

Let d be the number of positive factors of \(x^{N}-1\), then the number of cyclic codes with length N is \(d+1\).

A cyclic code C corresponds to an ideal \(C(x)=\langle g(x) \rangle {{\,\mathrm{mod}\,}}x^{N}-1\) in R, we define

Definition 7.14

Let C be a cyclic code, if \(C(x)=\langle g(x)\rangle {{\,\mathrm{mod}\,}}x^{N}-1\), then g(x) is called the generating polynomial of C, where \(g(x)|x^{N}-1\).

If \(g(x)=x^{N}-1\), then \(\langle x^{N}-1\rangle {{\,\mathrm{mod}\,}}x^{N}-1=0\), corresponding to the zero ideal in R; the corresponding cyclic code \(C=\{0=00\cdots 0\}\) is called the zero code. If \(g(x)=1\), then \( \langle g(x)\rangle {{\,\mathrm{mod}\,}}x^{N}-1=R\), and the corresponding code is \(C=\mathbb {F}_{q}^{N}\). Therefore, among the cyclic codes of length N there are always two trivial ones, the zero code and \(\mathbb {F}_{q}^{N}\), corresponding to the zero ideal in R and to R itself, respectively.

Lemma 7.56

Let \(g(x)|x^{N}-1\) be the generating polynomial of the cyclic code C with \(\deg g(x)=N-k\); then C is an [N, k] linear code. Further, write \(g(x)=g_{0}+g_{1}x+\cdots +g_{N-k-1}x^{N-k-1}+g_{N-k}x^{N-k}\), with corresponding codeword \(g=(g_{0}, g_{1}, \ldots , g_{N-k}, 0, 0\), \(\ldots , 0)\in C\); then the generating matrix G of C is
$$\begin{aligned} G=\left[ \begin{array}{cccc} g \\ gT_{1} \\ \vdots \\ gT_{1}^{k-1} \\ \end{array}\right] _{k\times N}. \end{aligned}$$
(7.174)

Proof

Let C correspond to the ideal \(C(x)=\langle g(x)\rangle {{\,\mathrm{mod}\,}}x^{N}-1\); then \(g(x),xg(x),\ldots \), \(x^{k-1}g(x)\in C(x)\), and their corresponding codewords are \(\{g, gT_{1},\ldots , gT_{1}^{k-1}\}\subset C\). Let’s prove that \(\{g,gT_{1},\ldots ,gT_{1}^{k-1}\}\) is a basis of C. Suppose \(a_{i}\in \mathbb {F}_{q}\) satisfy \(\sum _{i=0}^{k-1}a_{i}gT_{1}^{i}=0\); then the corresponding polynomial is 0, that is
$$\left( \sum _{i=0}^{k-1}a_{i}gT_{1}^{i}\right) (x)=\sum _{i=0}^{k-1}a_{i}gT_{1}^{i}(x)=\sum _{i=0}^{k-1}a_{i}x^{i}g(x)=0.$$
Thus
$$\sum _{i=0}^{k-1}a_{i}x^{i}=0 \Rightarrow \forall ~a_{i}=0,0\le i\le k-1.$$
That is, \(\{g, gT_{1}, \ldots , gT_{1}^{k-1}\}\) is a linearly independent set in C. Further, every \(c\in C\) can be expressed linearly in terms of it. Suppose \(c\in C\), then \(c(x)\in C(x)\); by (7.173), there is f(x),
$$\begin{aligned} \begin{aligned} f(x)&=f_{0}+f_{1}x+\cdots +f_{k-1}x^{k-1}\Rightarrow c(x)=g(x)f(x)\\&=\sum _{i=0}^{k-1}f_{i}x^{i}g(x)\Rightarrow c =\sum _{i=0}^{k-1}f_{i}gT_{1}^{i}. \end{aligned} \end{aligned}$$
This proves that the dimension of linear subspace C is \(N-\deg g(x)=k\); that is, C is [Nk] linear code. Its generating matrix G is
$$G=\left[ \begin{array}{cccc} g \\ gT_{1} \\ \vdots \\ gT_{1}^{k-1} \\ \end{array}\right] _{k\times N}. $$
The Lemma holds.
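Lemma 7.56 can be illustrated with the [7, 4] binary code generated by \(g(x)=1+x+x^{3}\) (a worked example of our own choosing): the k rows \(g, gT_{1}, \ldots , gT_{1}^{k-1}\) span a \(2^{k}\)-element code that is closed under the cyclic shift.

```python
from itertools import product

N, k = 7, 4
g = [1, 1, 0, 1, 0, 0, 0]                       # g(x) = 1 + x + x^3, deg = N - k = 3

def shift(c):
    """cT1: cyclic right shift."""
    return c[-1:] + c[:-1]

rows = [g]
for _ in range(k - 1):
    rows.append(shift(rows[-1]))                # G = [g; gT1; ...; gT1^{k-1}], Eq. (7.174)

# All F_2-linear combinations of the rows of G.
code = {tuple(sum(a * r for a, r in zip(word, col)) % 2 for col in zip(*rows))
        for word in product([0, 1], repeat=k)}
assert len(code) == 2 ** k                      # C is a [7, 4] code
assert all(tuple(shift(list(c))) in code for c in code)   # C is cyclic
```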

Next, we discuss the dual code of cyclic code and its check matrix.

Lemma 7.57

Let \(C\subset \mathbb {F}_{q}^{N}\) be a cyclic code with generating polynomial g(x), \(\deg g(x)=N-k\). Let \(g(x)h(x)=x^{N}-1\), \(h(x)=h_{0}+h_{1}x+\cdots +h_{k}x^{k}\), and let \(h=(h_{0},h_{1},\ldots ,h_{k},0,0,\ldots ,0)\in \mathbb {F}_{q}^{N}\) be the corresponding codeword, with \(\overline{h}\) its reverse order codeword. Then the check matrix of C is
$$\begin{aligned} H=\left[ \begin{array}{cccc} \overline{h} \\ \overline{h}T_{1} \\ \vdots \\ \overline{h}T_{1}^{N-k-1}\\ \end{array}\right] _{(N-k)\times N}. \end{aligned}$$
(7.175)
The dual code \(C^{\bot }\) of C is \([N,N-k]\) linear code, and
$$C^{\bot }=\{aH|a\in \mathbb {F}_{q}^{N-k}\},$$
h(x) is called the check polynomial of cyclic code C.

Proof

By Lemma 7.56, C is a k-dimensional linear subspace, and its generating matrix G is given by (7.174). Because \(g(x)h(x)=x^{N}-1\), we have \(g(x)h(x)=0\) in the ring R. Equivalently, comparing the coefficients of \(x^{i}\),
$$g_{0}h_{i}+g_{1}h_{i-1}+\cdots +g_{N-k}h_{i-N+k}=0,\quad \forall ~ 1\le i\le N-1.$$
In matrix form this reads \(GH'=0\), so H is the generating matrix of the dual code of C, and the Lemma holds.
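For the [7, 4] binary cyclic code generated by \(g(x)=1+x+x^{3}\), the check matrix of Lemma 7.57 can be verified directly: over \(\mathbb {F}_{2}\), \(h(x)=(x^{7}-1)/g(x)=1+x+x^{2}+x^{4}\), and the rows built from the reverse codeword \(\overline{h}\) are orthogonal to the rows of G (sketch; names ours):

```python
N, k = 7, 4
g = [1, 1, 0, 1, 0, 0, 0]     # g(x) = 1 + x + x^3
h = [1, 1, 1, 0, 1, 0, 0]     # h(x) = (x^7 - 1)/g(x) = 1 + x + x^2 + x^4 over F_2

def shift(c):
    """cT1: cyclic right shift."""
    return c[-1:] + c[:-1]

def rows_of(first, n):
    out = [first]
    for _ in range(n - 1):
        out.append(shift(out[-1]))
    return out

G = rows_of(g, k)             # generating matrix, Eq. (7.174)
H = rows_of(h[::-1], N - k)   # check matrix from the reverse codeword, Eq. (7.175)
# G H' = 0 over F_2: each row of G is orthogonal to each row of H.
assert all(sum(x * y for x, y in zip(u, v)) % 2 == 0 for u in G for v in H)
```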

Remark 7.5

The polynomial \(\overline{h(x)}\) corresponding to the reverse codeword \(\overline{h}\) is
$$\overline{h(x)}=h_{0}x^{N-1}+h_{1}x^{N-2}+\cdots +h_{k}x^{N-k-1}.$$
In general, when \(h(x)|x^{N}-1\), \(\overline{h(x)}\) does not divide \(x^{N}-1\); therefore, the dual code of a cyclic code is not necessarily a cyclic code.

Definition 7.15

Let \(x^{N}-1=g_{1}(x)g_{2}(x)\cdots g_{t}(x)\) be the irreducible decomposition of \(x^{N}-1\) over \(\mathbb {F}_{q}\), where each \(g_{i}(x)(1\le i\le t)\) is a monic irreducible polynomial in \(\mathbb {F}_{q}[x]\). Then the cyclic code generated by \(g_{i}(x)\) is called the i-th maximal cyclic code in \(\mathbb {F}_{q}^{N}\), denoted by \(M_{i}^{+}\). The cyclic code generated by \(\frac{x^{N}-1}{g_{i}(x)}\) is called the i-th minimal cyclic code, denoted by \(M_{i}^{-}\).

Minimal cyclic codes are also called irreducible cyclic codes, because \(M_{i}^{-}\) no longer contains any nontrivial cyclic code of \(\mathbb {F}_{q}^{N}\). The irreducibility of minimal cyclic codes can be derived from the fact that the ideal \(M_{i}^{-}(x)\) in R corresponding to \(M_{i}^{-}\) is a field; we can give a purely algebraic proof.

Corollary 7.12

Let \(M_{i}^{-}\) be the i-th minimal cyclic code of \(\mathbb {F}_{q}^{N} (1\le i\le t)\), \(M_{i}^{-}(x)\) is the ideal corresponding to \(M_{i}^{-}\) in R, then \(M_{i}^{-}(x)\) is a field, thus, \( M_{i}^{-}\) no longer contains any nontrivial cyclic code of \(\mathbb {F}_{q}^{N}\).

Proof

Let \(g(x)=(x^{N}-1)/g_{i}(x)\), where \(g_{i}(x)\) is an irreducible polynomial in \(\mathbb {F}_{q}[x]\); by (7.173),
$$M_{i}^{-}(x)=g(x)\mathbb {F}_{q}[x]/(x^{N}-1)\mathbb {F}_{q}[x]\cong \mathbb {F}_{q}[x]/g_{i}(x)\mathbb {F}_{q}[x],$$
where \(g(x)\mathbb {F}_{q}[x]\) is the principal ideal generated by g(x) in \(\mathbb {F}_{q}[x]\). Since \(g_{i}(x)\) is an irreducible polynomial, so \(M_{i}^{-}(x)\) is a field.

Example 7.1

All cyclic codes with length of 7 are determined on binary finite field \(\mathbb {F}_{2}\).

Solution: The polynomial \(x^{7}-1\) has the following irreducible decomposition over \(\mathbb {F}_{2}\)
$$x^{7}-1=(x-1)(x^{3}+x+1)(x^{3}+x^{2}+1).$$
Therefore, \(x^{7}-1\) has 7 positive factors over \(\mathbb {F}_{2}\); by Corollary 7.11, there are 8 cyclic codes of length 7 over \(\mathbb {F}_{2}\), among which 0 and \(\mathbb {F}_{2}^{7}\) are the two trivial cyclic codes. There are three maximal cyclic codes, generated by \(g(x)=x-1\), \(g(x)=x^{3}+x+1\) and \(g(x)=x^{3}+x^{2}+1\), respectively; the corresponding dimensions are 6, 4 and 4. Similarly, there are three minimal cyclic codes, of dimensions 1, 3 and 3, respectively.
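The count in this example can be confirmed by multiplying out all subsets of the three irreducible factors; a small sketch representing \(\mathbb {F}_{2}\)-polynomials as bitmasks (an encoding of our own choosing):

```python
from itertools import combinations

def pmul(a, b):
    """Multiply two GF(2) polynomials given as bitmasks (bit i = coefficient of x^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

# x^7 - 1 = (x + 1)(x^3 + x + 1)(x^3 + x^2 + 1) over F_2
factors = [0b11, 0b1011, 0b1101]
divisors = set()
for t in range(len(factors) + 1):
    for sub in combinations(factors, t):
        d = 1
        for fpoly in sub:
            d = pmul(d, fpoly)
        divisors.add(d)
assert len(divisors) == 8            # 8 monic divisors <=> 8 cyclic codes of length 7
full = pmul(pmul(0b11, 0b1011), 0b1101)
assert full == 0b10000001            # their product is x^7 + 1
```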
Another characterization of cyclic codes is by the zeros of polynomials. If \(x^{N}-1=g_{1}(x)\cdots g_{t}(x)\), the ideal \(M_{i}^{+}(x)\) in R corresponding to the maximal cyclic code \(M_{i}^{+}(1\le i\le t)\) generated by \(g_{i}(x)\) is
$$M_{i}^{+}(x)=\{g_{i}(x)f(x)|0\le \deg f(x)\le N-\deg g_{i}(x)-1\}.$$
Let \(\beta \) be a root of \(g_{i}(x)\) in its splitting field. Then \(g_{i}(x)\) is the minimal polynomial of \(\beta \) over \(\mathbb {F}_{q}\), and every \(c(x)\in M_{i}^{+}(x)\) satisfies \(c(\beta )=0\). Therefore,
$$M_{i}^{+}(x)=\{c(x)|c(x)\in R,\text {and}~c(\beta )=0\}.$$

Example 7.2

Suppose \(N=(q^{m}-1)/(q-1)\), \((m,q-1)=1\), and \(\beta \) is an N-th primitive unit root in \(\mathbb {F}_{q^{m}}\); then the cyclic code
$$C=\{c(x)|c(\beta )=0,c(x)\in R\}$$
is equivalent to Hamming code \([N,N-m]\).

Proof

Because \((m,q-1)=1\), and
$$N=q^{m-1}+q^{m-2}+\cdots +q+1=(q-1)(q^{m-2}+2q^{m-3}+\cdots +(m-1))+m.$$
So \((N,q-1)=1\). Therefore, \(\beta ^{i(q-1)}\ne 1\) for \(1\le i\le N-1\); in other words, \(\beta ^{i}\notin \mathbb {F}_{q}\) for all \(1\le i\le N-1\). Hence in \(\mathbb {F}_{q^{m}}\), any two elements of \(\{1,\beta ,\beta ^{2},\ldots ,\beta ^{N-1}\}\) are linearly independent over \(\mathbb {F}_{q}\). If each element is regarded as an m-dimensional column vector over \(\mathbb {F}_{q}\), then the \(m\times N\)-order matrix
$$H=[1,\beta ,\beta ^{2},\ldots ,\beta ^{N-1}]_{m\times N}$$
constitutes the check matrix of the cyclic code C, and any two columns of H are linearly independent over \(\mathbb {F}_{q}\); by the definition, C is the \([N,N-m]\) Hamming code.

Lemma 7.58

Let \(C\subset \mathbb {F}_{q}^{N}\) be a cyclic code and \(C(x)\subset \mathbb {F}_{q}[x]/\langle x^{N}-1 \rangle \) the corresponding ideal, with \((N,q)=1\). Then C(x) contains a multiplicative identity, that is, an element \(c(x)\in C(x)\) such that
$$c(x)d(x)\equiv d(x)({{\,\mathrm{mod}\,}}x^{N}-1),\forall ~ d(x)\in C(x).$$
The unit element c(x) in C(x) is unique.

Proof

Since \((N,q)=1\), \(x^{N}-1\) has no repeated roots. Let g(x) be the generator polynomial of C and h(x) the check polynomial of C, so that \(g(x)h(x)=x^{N}-1\). Therefore \((g(x),h(x))=1\), and there exist \(a(x), b(x)\in \mathbb {F}_{q}[x]\) with
$$a(x)g(x)+b(x)h(x)=1.$$
Let \(c(x)=a(x)g(x)=1-b(x)h(x)\in C(x)\), so for \(\forall ~d(x)\in C(x)\), write \(d(x)=g(x)f(x)\), thus
$$\begin{aligned} \begin{aligned} c(x)d(x)&=a(x)g(x)g(x)f(x)\\&=(1-b(x)h(x))g(x)f(x)\\&=g(x)f(x)-b(x)h(x)g(x)f(x). \end{aligned} \end{aligned}$$
Therefore
$$c(x)d(x)\equiv d(x)({{\,\mathrm{mod}\,}}x^{N}-1).$$
That is, \(c(x)d(x)=d(x)\) in \(R=\mathbb {F}_{q}[x]/\langle x^{N}-1 \rangle \), so c(x) is the multiplicative identity of C(x). Uniqueness is clear: if \(c_{1}(x)\) and \(c_{2}(x)\) are both identities, then \(c_{1}(x)=c_{1}(x)c_{2}(x)=c_{2}(x)\). The Lemma holds.
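The construction in this proof can be carried out explicitly. Below is a minimal sketch over \(\mathbb {F}_{2}\) with \(N=7\) and \(g(x)=x^{3}+x+1\); polynomials are coefficient lists, and pmul, pdivmod, pegcd are ad-hoc helpers (pegcd implements the extended Euclidean algorithm that produces a(x), b(x) with \(a(x)g(x)+b(x)h(x)=1\)):

```python
def norm(p):                     # drop leading zero coefficients
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def padd(a, b):                  # addition over GF(2) is XOR
    n = max(len(a), len(b))
    return norm([(a[i] if i < len(a) else 0) ^ (b[i] if i < len(b) else 0)
                 for i in range(n)])

def pmul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] ^= ai & bj
    return norm(c)

def pdivmod(a, b):               # long division over GF(2), b monic
    a, b = norm(list(a)), norm(list(b))
    q = [0] * max(1, len(a) - len(b) + 1)
    while len(a) >= len(b) and a != [0]:
        k = len(a) - len(b)
        q[k] = 1
        a = norm([x ^ (b[i - k] if 0 <= i - k < len(b) else 0)
                  for i, x in enumerate(a)])
    return norm(q), a

def pegcd(a, b):                 # returns s, t, gcd with s*a + t*b = gcd
    if norm(b) == [0]:
        return [1], [0], norm(a)
    q, r = pdivmod(a, b)
    s, t, g2 = pegcd(b, r)
    return t, padd(s, pmul(t, q)), g2

xN1 = [1, 0, 0, 0, 0, 0, 0, 1]   # x^7 - 1 = x^7 + 1 over F_2
g = [1, 1, 0, 1]                 # generator polynomial g(x) = x^3 + x + 1
h, r = pdivmod(xN1, g)           # check polynomial h(x), with g*h = x^7 - 1
assert r == [0]

a_, b_, d = pegcd(g, h)          # a*g + b*h = 1 since (g, h) = 1
assert d == [1]
c = pdivmod(pmul(a_, g), xN1)[1] # idempotent c(x) = a(x)g(x) mod (x^7 - 1)

# c acts as the multiplicative identity on the code, and is idempotent
for f in ([1], [0, 1], [1, 0, 1]):           # a few multipliers f(x)
    dx = pdivmod(pmul(g, f), xN1)[1]         # codeword d(x) = g(x)f(x)
    assert pdivmod(pmul(c, dx), xN1)[1] == dx
assert pdivmod(pmul(c, c), xN1)[1] == c
```

For this code the computation yields \(c(x)=x+x^{2}+x^{4}\), which indeed satisfies \(c(x)^{2}\equiv c(x)\pmod {x^{7}-1}\).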

Definition 7.16

Let \(C\subset \mathbb {F}_{q}^{N}\) be a cyclic code. The multiplicative identity c(x) of C(x) is called the idempotent element of C. If \(C=M_{i}^{-}\) is the i-th minimal cyclic code, the idempotent of C is called a primitive idempotent, denoted by \(\theta _{i}(x)\).

Lemma 7.59

Let \(C_{1}\subset \mathbb {F}_{q}^{N}\) and \(C_{2}\subset \mathbb {F}_{q}^{N}\) be two cyclic codes with \((N,q)=1\), and let their idempotents be \(c_{1}(x)\) and \(c_{2}(x)\), respectively. Then
  1. (i)

\(C_{1}\bigcap C_{2}\) is also a cyclic code of \(\mathbb {F}_{q}^{N}\), with idempotent \(c_{1}(x)c_{2}(x)\).

     
  2. (ii)

\(C_{1}+C_{2}\) is also a cyclic code of \(\mathbb {F}_{q}^{N}\), with idempotent \(c_{1}(x)+c_{2}(x)-c_{1}(x)c_{2}(x)\).

     

Proof

It is obvious that \(C_{1}\bigcap C_{2}\) and \(C_{1}+C_{2}\) are cyclic codes in \(\mathbb {F}_{q}^{N}\): they correspond to the ideals \(C_{1}(x)\) and \(C_{2}(x)\) in R, so that
$$C_{1}(x)\cap C_{2}(x) ~\text {and}~C_{1}(x)+C_{2}(x)$$
are still ideals in R. Therefore the corresponding codes \(C_{1}\cap C_{2}\) and \(C_{1}+C_{2}\) are cyclic codes, and the statements about the idempotents are straightforward to verify. The Lemma holds.

In 1959, A. Hocquenghem, and independently in 1960, R. Bose and D. Ray-Chaudhuri, proposed a special class of cyclic codes with a prescribed minimal distance, now generally known as BCH codes.

Definition 7.17

A cyclic code \(C\subset \mathbb {F}_{q}^{N}\) of length N is called a \(\delta \)-BCH code if its generator polynomial is the least common multiple of the minimal polynomials of \(\beta ,\beta ^{2},\ldots ,\beta ^{\delta -1}\), where \(\delta \) is a positive integer and \(\beta \) is a primitive N-th root of unity. A \(\delta \)-BCH code is also called a BCH code with designed distance \(\delta \). If \(\beta \in \mathbb {F}_{q^{m}}\) and \(N=q^{m}-1\), such BCH codes are called primitive.

Lemma 7.60

Let d be the minimal distance of a \(\delta \)-BCH code, then we have \(d \ge \delta \).

Proof

Suppose \(x^{N}-1=(x-1)g_{1}(x)g_{2}(x)\cdots g_{t}(x)\) and \(\beta \) is a primitive N-th root of unity over \(\mathbb {F}_{q}\); then \(\beta \) is a root of some \(g_{i}(x)\). Let \(\deg g_{i}(x)=m\), so \(\beta \in \mathbb {F}_{q^{m}}\). Since \([\mathbb {F}_{q^{m}}:\mathbb {F}_{q}]=m\), each power of \(\beta \) can be regarded as an m-dimensional column vector over \(\mathbb {F}_{q}\). Let H be the following \(m(\delta -1)\times N\) matrix.
$$H= \left[ \begin{array}{ccccc} 1 &{} \beta &{} \beta ^{2} &{} \cdots &{} \beta ^{N-1} \\ 1 &{} \beta ^{2} &{} \beta ^{4} &{} \cdots &{} \beta ^{2(N-1)} \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ 1 &{} \beta ^{\delta -1} &{} \beta ^{2(\delta -1)} &{} \cdots &{} \beta ^{(N-1)(\delta -1)} \end{array}\right] _{m(\delta -1)\times N}.$$
In fact, H is the check matrix of \(\delta \)-BCH code C, that is
$$c\in C\Longleftrightarrow cH'=0.$$
We prove that any \((\delta -1)\) columns of H are linearly independent. Let the first components of these \((\delta -1)\) columns be \(\beta ^{i_{1}},\beta ^{i_{2}},\ldots ,\beta ^{i_{\delta -1}}\), where \(i_{j}\ge 0\); the corresponding determinant is the Vandermonde determinant \(\triangle \), and
$$\triangle =\beta ^{i_{1}+i_{2}+\cdots +i_{\delta -1}}\prod _{r>s}(\beta ^{i_{r}}-\beta ^{i_{s}})\ne 0.$$
Therefore, any \((\delta -1)\) column vectors of H are linearly independent. Thus, the minimum distance of C is \(d\ge \delta \).
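The bound \(d\ge \delta \) can be observed on a small instance. Assuming \(q=2\), \(N=7\), \(\delta =3\) and \(\beta \) a root of \(x^{3}+x+1\) in \(\mathbb {F}_{8}\): the minimal polynomials of \(\beta \) and \(\beta ^{2}\) coincide, so the generator polynomial is \(g(x)=x^{3}+x+1\) and the code is the \([7,4]\) code of Example 7.2. The sketch below enumerates all 16 codewords and checks that the minimal nonzero weight is exactly \(3\ge \delta \):

```python
from itertools import product

def pmul(a, b):  # GF(2) polynomial product, coefficient lists, index = degree
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] ^= ai & bj
    return c

g = [1, 1, 0, 1]   # g(x) = x^3 + x + 1, lcm of the minimal polynomials of beta, beta^2
N, k = 7, 4        # a [7, 4] cyclic code

weights = []
for f in product([0, 1], repeat=k):      # all messages f(x) with deg f <= k-1
    c = pmul(list(f), g)                 # codeword c(x) = f(x)g(x); deg < 7, no reduction needed
    c = (c + [0] * N)[:N]
    if any(c):
        weights.append(sum(c))
print(min(weights))  # minimal distance d = 3 >= delta = 3, as Lemma 7.60 guarantees
```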

Now, we can introduce the design principle of the McEliece/Niederreiter cryptosystem. Its basic mathematical idea rests on the decoding principle of error-correcting codes. Recall the concept from Chap. 2: a code \(C\subset \mathbb {F}_{q}^{N}\) is called a t-error-correcting code (\(t\ge 1\) a positive integer) if for every \(y\in \mathbb {F}_{q}^{N}\) there is at most one codeword \(c\in C\) with \(d(c,y)\le t\), where d(c, y) is the Hamming distance between c and y. We know that if the minimum distance of a code C is d, then C is a t-error-correcting code with \(t=\left\lfloor \frac{d-1}{2} \right\rfloor \), the largest integer not exceeding \(\frac{d-1}{2}\). Lemma 7.60 establishes the existence of t-error-correcting codes for every positive integer t, namely the \((2t+1)\)-BCH codes (\(\delta =2t+1\)); such codes belong to the family of Goppa codes (see the next section), which provides the theoretical basis for the McEliece/Niederreiter cryptosystem. Next, we introduce the working mechanism of this cryptosystem in detail, beginning with the generation of the keys.

Private key: Select a t-error-correcting code \(C\subset \mathbb {F}_{q}^{N}\), \(C=[N,k]\), with check matrix H, an \((N-k)\times N\) matrix. For \(x\in \mathbb {F}_{q}^{N}\), \(x\mapsto xH'\in \mathbb {F}_{q}^{N-k}\) is a map from \(\mathbb {F}_{q}^{N}\) to \(\mathbb {F}_{q}^{N-k}\); let us prove that this map is injective on the codewords of weight at most t.

Lemma 7.61

\(\forall ~x, y \in \mathbb {F}_{q}^{N}\), if \(xH'=yH'\), and \(w(x)\le t,w(y)\le t\), then \(x=y\).

Proof

By hypothesis,
$$xH'=yH'\Rightarrow (x-y)H'=0\Rightarrow x-y\in C.$$
Obviously, the Hamming distance \(d(0,x)=w(x)\le t\) between x and 0, and the Hamming distance \(d(x,x-y)\) between x and \(x-y\) is
$$d(x,x-y)=w(x-(x-y))=w(y)\le t.$$
Because C is a t-error-correcting code, \(x-y=0\). The Lemma holds.

We use the t-error-correcting code C and its check matrix H as the private key.

Public key: To generate the public key, we randomly select a permutation matrix \(P_{N\times N}\): let \(I_{N}=[e_{1},e_{2},\ldots ,e_{N}]\) be the N-order identity matrix and \(\sigma \in S_{N}\) an N-ary permutation; then
$$P=\sigma (I_{N})=[e_{\sigma (1)},e_{\sigma (2)}, \ldots , e_{\sigma (N)}].$$
Such a matrix is also called a Weyl matrix. One may also randomly select a nonsingular diagonal matrix \(\mathrm{diag}\{\lambda _{1},\lambda _{2},\ldots ,\lambda _{N}\}\) \((\lambda _{i}\in \mathbb {F}_{q},\lambda _{i}\ne 0)\) and set
$$P=\sigma (\mathrm{diag}\{\lambda _{1},\lambda _{2},\ldots ,\lambda _{N}\}) =\mathrm{diag}\{\lambda _{\sigma _{1}},\lambda _{\sigma _{2}},\ldots ,\lambda _{\sigma _{N}}\}.$$
Let M be an \((N-k)\times (N-k)\)-order invertible matrix. The public key is the matrix K generated as follows,
$$K=PH'M, \quad \mathrm{an}~N\times (N-k)~\mathrm{order~matrix}.$$
We take K as the public key and H, P and M as the private key.
Encryption: Let \(m\in \mathbb {F}_{q}^{N}\) be a codeword with \(w(m)\le t\); m is the plaintext, encrypted as follows.
$$c=mK\in \mathbb {F}_{q}^{N-k},\quad c~\text {is the ciphertext}.$$
In fact, a plaintext of length N and weight at most t over \(\mathbb {F}_{q}\) is encrypted, through the public key K, into a ciphertext of length \(N-k\) over \(\mathbb {F}_{q}\).
Decryption: After receiving the ciphertext c, decrypt it with the private keys H, P and M.
$$c\cdot M^{-1}=mKM^{-1}=mPH'MM^{-1}=mPH'.$$
Since \(mP\in \mathbb {F}_{q}^{N}\) and m have the same Hamming weight, that is
$$w(m)=w(mP)\le t.$$
Using the decoding principle of error-correcting codes: the vectors \(x\in \mathbb {F}_{q}^{N}\) satisfying \(xH'=mPH'\) constitute an additive coset of the code C, and mP, being the coset leader (the unique vector of weight at most t) of this coset, can be recovered exactly. That is
$$mPH'{\mathop {\longrightarrow }\limits ^{\text {decode}}}mP.$$
Finally, we have \(m=(mP)\cdot P^{-1}\), and the plaintext is recovered.
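The complete encrypt/decrypt cycle above can be sketched numerically. The following is a minimal toy instance (in no way secure), assuming the binary \([7,4]\) Hamming code with \(t=1\); the permutation perm and the matrix M are arbitrary illustrative choices, and decoding uses the fact that for a Hamming code the syndrome of a weight-1 error equals the corresponding column of H:

```python
import numpy as np

# toy Niederreiter-style instance over F_2: [7,4] Hamming code, t = 1
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])      # 3x7 check matrix, column j is binary j+1

perm = [3, 0, 6, 2, 5, 1, 4]               # an arbitrary permutation sigma
P = np.eye(7, dtype=int)[:, perm]          # permutation matrix
M = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 1]])     # invertible 3x3 over F_2
Minv = np.array([[1, 1, 1], [0, 1, 1], [0, 0, 1]])  # its inverse mod 2
assert (M @ Minv % 2 == np.eye(3)).all()

K = P @ H.T @ M % 2                        # public key, a 7x3 matrix

def encrypt(m):                            # plaintext m in F_2^7 with w(m) <= t
    return m @ K % 2

def decrypt(c):
    s = c @ Minv % 2                       # s = m P H'
    if not s.any():
        mP = np.zeros(7, dtype=int)
    else:                                  # weight-1 vector: syndrome = matching column of H
        j = next(j for j in range(7) if (H[:, j] == s).all())
        mP = np.eye(7, dtype=int)[j]
    return mP @ P.T % 2                    # m = (mP) P^{-1}, and P^{-1} = P' for permutations

for i in range(7):                         # every weight-1 plaintext round-trips
    m = np.eye(7, dtype=int)[i]
    assert (decrypt(encrypt(m)) == m).all()
```

Here the plaintext has length 7 and weight at most 1, while the ciphertext has length \(N-k=3\), matching the description above.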

7.9 Ajtai/Dwork Cryptosystem

By choosing an appropriate \(n\times m\)-order matrix \(A\in \mathbb {Z}_q^{n\times m}\), two m-dimensional q-element lattices \(\Lambda _q(A)\) and \(\Lambda _q^{\bot }(A)\) are defined (see (7.45) and (7.46)),
$$\begin{aligned} \Lambda _q(A)=\{y\in \mathbb {Z}^m|\exists ~x\in \mathbb {Z}^n\Rightarrow y\equiv A^{'}x({{\,\mathrm{mod}\,}}q)\} \end{aligned}$$
and
$$\begin{aligned} \Lambda _q^\bot (A)=\{y\in \mathbb {Z}^m| Ay\equiv 0({{\,\mathrm{mod}\,}}q)\}. \end{aligned}$$
Using matrix A, an anti-collision hash function can be defined:
$$\begin{aligned} f_A:\{0,1,\ldots ,d-1\}^m\rightarrow \mathbb {Z}_q^n, \end{aligned}$$
(7.176)
where for any \(y\in \{0,1,\ldots ,d-1\}^m\), define \(f_{A}(y)\) as
$$\begin{aligned} f_{A}(y)=Ay ~{{\,\mathrm{mod}\,}}q, \end{aligned}$$
(7.177)
If the parameters d, q, n, m satisfy
$$\begin{aligned} n\log q<m \log d\Rightarrow \frac{n\log q}{\log d}<m. \end{aligned}$$
(7.178)
then the hash function \(f_A\) must produce collisions, since the domain contains \(d^m\) elements while the range contains at most \(q^n<d^m\); that is, there exist y, \(y^{'}\in \{0,1,\ldots ,d-1\}^m\) with \(y\ne y^{'}\) and \(f_A(y)=f_A(y^{'})\). By (7.177), we have directly
$$\begin{aligned} A(y-y^{'})\equiv 0({{\,\mathrm{mod}\,}}q)\Rightarrow y-y^{'} \in \Lambda _q^\bot (A), \end{aligned}$$
this shows that a pair of collision points y and \(y^{'}\) of the hash function \(f_A\) directly yields a short nonzero vector \(y-y^{'}\) (with entries in \(\{-(d-1),\ldots ,d-1\}\)) in the q-ary lattice \(\Lambda _q^\bot (A)\).
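The pigeonhole argument behind (7.178) can be demonstrated directly. A minimal sketch with assumed toy parameters \(n=2\), \(m=6\), \(q=3\), \(d=2\) (so \(d^m=64\) inputs map into \(q^n=9\) outputs): enumerating the domain must produce a collision, whose difference is a short nonzero vector of \(\Lambda _q^{\bot }(A)\):

```python
import numpy as np

# n log q < m log d, so f_A cannot be injective
n, m, q, d = 2, 6, 3, 2
rng = np.random.default_rng(0)
A = rng.integers(0, q, size=(n, m))

seen = {}
for idx in range(d ** m):                  # enumerate all y in {0,...,d-1}^m (d = 2: bits)
    y = np.array([(idx >> i) & 1 for i in range(m)])
    key = tuple(A @ y % q)                 # f_A(y) = A y mod q
    if key in seen:
        z = y - seen[key]                  # collision => z lies in Lambda_q^perp(A)
        break
    seen[key] = y

assert z.any() and not (A @ z % q).any()
print(z)                                   # a short lattice vector, entries in {-1, 0, 1}
```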

To obtain an anti-collision hash function, the selection of the \(n\times m\) matrix A is very important. First, one may select the parameter system \(d=2\), \(q=n^2\), n|m, and \(m\log 2>n \log q\), where n is a positive integer. In the Ajtai/Dwork cryptographic algorithm there are two choices of the parameter matrix A: one is the cyclic matrix, the other the more general ideal matrix. The corresponding q-ary lattices \(\Lambda _q^{\bot }(A)\) are cyclic lattices and ideal lattices, respectively.

Cyclic lattice

Because n|m, A can be divided into \(\frac{m}{n}\) \(n\times n\)-order cyclic matrices, that is
$$\begin{aligned} A=[A^{(1)},A^{(2)},\ldots ,A^{(\frac{m}{n})}], \end{aligned}$$
(7.179)
where each \(\alpha ^{(i)}\in \mathbb {Z}_q^n\) is an n-dimensional column vector and \(A^{(i)}\) is the cyclic matrix generated by \(\alpha ^{(i)}\) (see (7.149)), that is
$$\begin{aligned} A^{(i)}=T^*(\alpha ^{(i)})=[\alpha ^{(i)},T\alpha ^{(i)},\ldots ,T^{n-1}\alpha ^{(i)}], 1\le i\le \frac{m}{n}. \end{aligned}$$
A is called an \(n\times m\)-dimensional generalized cyclic matrix, and the q-element lattice in \(\mathbb {R}^m\) defined by A,
$$\begin{aligned} \Lambda _q^{\bot }(A)=\{y\in \mathbb {Z}^m| Ay\equiv 0({{\,\mathrm{mod}\,}}q)\} \end{aligned}$$
is called a cyclic lattice. The Ajtai/Dwork cryptosystem based on cyclic lattice can be stated as follows:

Algorithm 1: Hash function based on cyclic lattice.

Parameters: q, n, m, d are positive integers, \(n \mid m\), \(m \log d >n \log q\).

Secret key: \(\frac{m}{n}\) column vectors \(\alpha ^{(i)}\in \mathbb {Z}_{q}^{n}, 1 \le i \le \frac{m}{n}\).

Hash function \(f_A:\{0,1,\ldots , d-1\}^{m}\longrightarrow \mathbb {Z}_{q}^{n}\) define as
$$f_{A}(y)\equiv Ay ({{\,\mathrm{mod}\,}}q),$$
the cyclic matrix \(A\in \mathbb {Z}_{q}^{n\times m}\) is given by (7.179).
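Algorithm 1 can be sketched in a few lines. The parameters below are illustrative toy choices satisfying \(n\mid m\) and \(m\log d>n\log q\), and np.roll is assumed to implement the cyclic shift T of (7.149):

```python
import numpy as np

def circ(a):                      # cyclic matrix T*(a) = [a, Ta, ..., T^{n-1}a]
    n = len(a)
    return np.column_stack([np.roll(a, k) for k in range(n)])

# toy parameters with n | m and m log d > n log q  (20 log 2 > 4 log 16)
n, q, d, m = 4, 16, 2, 20
rng = np.random.default_rng(1)
alphas = [rng.integers(0, q, size=n) for _ in range(m // n)]   # secret key
A = np.hstack([circ(a) for a in alphas]) % q                   # n x m generalized cyclic matrix

def f_A(y):                       # hash {0,...,d-1}^m -> Z_q^n
    return A @ y % q

y = rng.integers(0, d, size=m)
print(f_A(y))
```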
We can extend the above concepts of cyclic matrix and cyclic lattice to more general cases and obtain the notions of ideal matrix and ideal lattice. Let h(x) be a monic integer polynomial of degree n, \(h(x)=x^n+a_{n-1}x^{n-1}+\cdots +a_{1}x+a_{0}\in \mathbb {Z}[x]\), and define the rotation matrix \(T_{h}\) as
$$\begin{aligned} T_{h}=\left( \begin{array}{cccc}0 &{} \cdots &{} 0 &{} -a_{0} \\ &{} &{} &{} -a_{1} \\ &{} I_{n-1} &{} &{} \vdots \\ &{} &{} &{} -a_{n-1}\end{array}\right) , \end{aligned}$$
(7.180)
if \(h(x)=x^{n}-1\), then \(T_{h}=T\), the matrix highlighted in Sect. 7.7 of this chapter. Here, we discuss the more general \(T_{h}\). Obviously, when the constant term \(a_0 \ne 0\), \(T_{h}\) is an invertible n-order square matrix, and \(\det (T_{h})=(-1)^{n}a_{0}\).

Lemma 7.62

The characteristic polynomial of rotation matrix \(T_{h}\) is \(f(\lambda )=h(\lambda )\).

Proof

By the definition, the characteristic polynomial \(f(\lambda )\) of \(T_{h}\) is
$$\begin{aligned} \begin{aligned} f(\lambda )&=\det (\lambda I_{n}-T_{h})\\&=\left| \begin{array}{ccccc} \lambda &{} 0 &{} \cdots &{} 0 &{} a_{0} \\ -1 &{} \lambda &{}\cdots &{} \vdots &{} \vdots \\ \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots \\ 0 &{} \cdots &{} \cdots &{}\lambda &{} a_{n-2} \\ 0 &{} \cdots &{} \cdots &{} -1 &{} a_{n-1} \end{array}\right| \\&=\frac{1}{\lambda }\frac{1}{\lambda ^2}\cdots \frac{1}{\lambda ^{n-1}}\left| \begin{array}{ccccc} \lambda &{} 0 &{} \cdots &{} 0 &{} a_{0} \\ 0 &{} \lambda &{}\cdots &{} \vdots &{} a_{1}\lambda +a_{0} \\ \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots \\ 0 &{} \cdots &{} \cdots &{} \cdots &{}\lambda ^n+a_{n-1}\lambda ^{n-1}+\cdots +a_{1}\lambda +a_{0} \end{array}\right| \\&=\lambda ^n+a_{n-1}\lambda ^{n-1}+\cdots +a_{1}\lambda +a_{0}=h(\lambda ). \end{aligned} \end{aligned}$$

Lemma 7.63

Let \(h(x)=x^n+a_{n-1}x^{n-1}+\cdots +a_{1}x+a_{0}\in \mathbb {Z}[x]\). If \(a_{0}\ne 0\), then the rotation matrix \(T_{h}\) is an invertible n-order square matrix, and
$$\begin{aligned} T_{h}^{-1}= \left[ \begin{array}{cc} -a_{0}^{-1}\alpha &{} I_{n-1} \\ -a_{0}^{-1} &{} 0 \end{array}\right] , ~\alpha =\left[ \begin{array}{cccc} a_{1}\\ a_{2}\\ \vdots \\ a_{n-1} \end{array}\right] \in \mathbb {Z}^{n-1}. \end{aligned}$$

Proof

By the definition of \(T_{h}\),
$$\begin{aligned} \begin{aligned} T_{h}\cdot \left[ \begin{array}{cc} -a_{0}^{-1}\alpha &{} I_{n-1} \\ -a_{0}^{-1} &{} 0 \end{array}\right]&= \left[ \begin{array}{cc} 0&{} -a_{0}\\ I_{n-1} &{} -\alpha \end{array}\right] \left[ \begin{array}{cc} a_{0}^{-1}\alpha &{} I_{n-1} \\ -a_{0}^{-1} &{} 0 \end{array}\right] \\&= \left[ \begin{array}{cc} 1&{} 0 \\ 0 &{} I_{n-1} \end{array}\right] =I_{n}. \end{aligned} \end{aligned}$$
So
$$\begin{aligned} T_{h}^{-1}= \left[ \begin{array}{cc} -a_{0}^{-1}\alpha &{} I_{n-1} \\ -a_{0}^{-1} &{} 0 \end{array}\right] . \end{aligned}$$
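Lemma 7.63 can be checked numerically with exact rational arithmetic. The polynomial below is an illustrative choice; the matrices are built exactly as in (7.180) and in the statement of the Lemma:

```python
from fractions import Fraction

def matmul(A, B):                # plain exact matrix product over the rationals
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = [3, -1, 2, 5]                # h(x) = x^4 + 5x^3 + 2x^2 - x + 3 (illustrative)
n = len(a)

# rotation matrix T_h of (7.180): subdiagonal identity, last column -a_i
T = [[Fraction(-a[i]) if j == n - 1 else Fraction(i == j + 1)
      for j in range(n)] for i in range(n)]

# claimed inverse from Lemma 7.63: first column (-a0^{-1}a_1, ..., -a0^{-1}a_{n-1}, -a0^{-1}),
# then I_{n-1} sitting on top of a zero row
inv0 = Fraction(1, a[0])
Tinv = [[-inv0 * a[i + 1] if j == 0 else Fraction(i == j - 1)
         for j in range(n)] for i in range(n - 1)]
Tinv.append([-inv0] + [Fraction(0)] * (n - 1))

I = [[Fraction(i == j) for j in range(n)] for i in range(n)]
assert matmul(T, Tinv) == I and matmul(Tinv, T) == I
```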
For a given monic polynomial \(h(x)=x^n+a_{n-1}x^{n-1}+\cdots +a_{1}x+a_{0}\in \mathbb {Z}[x]\) of degree n, let R be the residue class ring of \(\mathbb {Z}[x]\) modulo h(x), i.e.,
$$\begin{aligned} R=\mathbb {Z}[x]/ \langle h(x)\rangle , \end{aligned}$$
(7.181)
where \(\langle h(x)\rangle \) is the ideal generated by h(x) in \(\mathbb {Z}[x]\). Since \(\deg h(x)=n\), every polynomial \(g(x)\in R\) has a unique representation \(g(x)=g_{n-1}x^{n-1}+g_{n-2}x^{n-2}+\cdots +g_{1}x+g_{0}\in R\); define the mapping \(\sigma : R \longrightarrow \mathbb {Z}^{n}\) as
$$\begin{aligned} \sigma (g(x))=\left[ \begin{array}{cccc} g_{0}\\ g_{1}\\ \vdots \\ g_{n-1} \end{array}\right] \in \mathbb {Z}^{n}. \end{aligned}$$
(7.182)
Obviously, \(\sigma \) is an Abelian group isomorphism from R to \(\mathbb {Z}^{n}\). Therefore, any polynomial g(x) in R can be regarded as an n-dimensional integer column vector.

Definition 7.18

For any n-dimensional column vector \(g=\sigma (g(x))=\left[ \begin{array}{cccc} g_{0}\\ g_{1}\\ \vdots \\ g_{n-1} \end{array}\right] \in \mathbb {Z}^{n}\) in \(\mathbb {Z}^n\), define
$$\begin{aligned} T_{h}^{*}(g)=[g, T_{h}(g), T_{h}^{2}(g), \ldots ,T_{h}^{n-1}(g)]_{n\times n}, \end{aligned}$$
(7.183)
the n-order square matrix \(T_{h}^{*}(g)\) is called an ideal matrix generated by vector g.

The ideal matrix is a generalization of the cyclic matrix: the former corresponds to an arbitrary monic polynomial h(x) of degree n, the latter to the special polynomial \(x^n-1\). We first prove that the ideal matrix \(T_{h}^{*}(g)\) generated by any vector \(g\in \mathbb {Z}^{n}\) commutes with the rotation matrix \(T_{h}\) under matrix multiplication.

Lemma 7.64

For any given monic polynomial \(h(x)\in \mathbb {Z}[x]\) of degree n and any n-dimensional column vector \(g\in \mathbb {Z}^{n}\), we have
$$\begin{aligned} T_{h}\cdot T_{h}^{*}(g)=T_{h}^{*}(g)\cdot T_{h}. \end{aligned}$$

Proof

Let \(h(x)=x^n+a_{n-1}x^{n-1}+\cdots +a_{1}x+a_{0}\in \mathbb {Z}[x]\), by Lemma 7.62, the characteristic polynomial of rotation matrix \(T_{h}\) is \(h(\lambda )\), then by Hamilton–Cayley theorem, we have
$$\begin{aligned} T_{h}^{n}+a_{n-1}T_{h}^{n-1}+\cdots +a_{1}T_{h}+a_{0}=0, \end{aligned}$$
(7.184)
there is
$$\begin{aligned} \begin{aligned} T_{h}^{*}(g)T_{h}&= [g,T_{h}g, T_{h}^{2}g,\ldots , T_{h}^{n-1}g] \left[ \begin{array}{cc} 0&{} -a_{0} \\ I_{n-1} &{} -\alpha \end{array}\right] \\&=[T_{h}g, T_{h}^{2}g,\ldots , -a_{0}g-a_{1}T_{h}g-\cdots -a_{n-1}T_{h}^{n-1}g]\\&=[T_{h}g, T_{h}^{2}g,\ldots ,( -a_{0}-a_{1}T_{h}-\cdots -a_{n-1}T_{h}^{n-1})g]\\&=[T_{h}g, T_{h}^{2}g,\ldots , T_{h}^{n}g]\\&=T_{h}[g, T_{h}g,\ldots , T_{h}^{n-1}g]\\&=T_{h}\cdot T_{h}^{*}(g). \end{aligned} \end{aligned}$$
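The commutation of Lemma 7.64 holds over the integers and is easy to check on a concrete instance; the polynomial and the vector below are illustrative choices:

```python
import numpy as np

def rot(a):                              # rotation matrix T_h of (7.180)
    n = len(a)
    T = np.zeros((n, n), dtype=int)
    T[1:, :n - 1] = np.eye(n - 1, dtype=int)
    T[:, -1] = [-ai for ai in a]
    return T

def ideal_matrix(T, g):                  # T_h^*(g) = [g, T_h g, ..., T_h^{n-1} g], per (7.183)
    cols, v = [], np.array(g)
    for _ in range(len(g)):
        cols.append(v)
        v = T @ v
    return np.column_stack(cols)

a = [2, -1, 0, 3]                        # h(x) = x^4 + 3x^3 - x + 2 (illustrative)
g = [1, 4, -2, 7]
T = rot(a)
G = ideal_matrix(T, g)
assert (T @ G == G @ T).all()            # Lemma 7.64
```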

When the monic n-degree integer coefficient polynomial h is selected, we want to establish the corresponding relationship between the ideal and the integer lattice \(L\subset \mathbb {Z}^{n}\) in the quotient ring \(R=\mathbb {Z}[x]/\langle h(x)\rangle \). First, we define the concept of an ideal lattice. In short, an ideal lattice is an integer lattice generated by the ideal matrix.

Definition 7.19

Let \(g=(g_0,g_1,\ldots ,g_{n-1})^T \in \mathbb {Z}^n\) be a given column vector and \(T_{h}^{*}(g)\) the ideal matrix generated by g; the integer lattice \(L=L(T_{h}^{*}(g))\) is called an ideal lattice.

Our main result is a 1-1 correspondence between the principal ideals of \(R=\mathbb {Z}[x]/\langle h(x) \rangle \) and the ideal lattices in \(\mathbb {Z}^n\). This also explains why \(L(T_{h}^{*}(g))\) is called an ideal lattice.

Theorem 7.10

The principal ideals in \(R=\mathbb {Z}[x]/\langle h(x)\rangle \) correspond 1-1 to the ideal lattices in \(\mathbb {Z}^n\). Specifically,
  1. (i)
    If \(N=\langle g(x) \rangle \) is any principal ideal in R, then
    $$\sigma (N) =\{\sigma (f)|f \in N\}=L(T_{h}^{*}(\sigma (g(x))))=L(T_{h}^{*}(g)).$$
     
  2. (ii)
    If \(g=(g_0,g_1,\ldots ,g_{n-1})^T \in \mathbb {Z}^n\) and \(L(T_{h}^{*}(g))\subset \mathbb {Z}^n\) is any ideal lattice, then
    $$\sigma ^{-1}(L(T_{h}^{*}(g))) =\{\sigma ^{-1}(b)|b \in L(T_{h}^{*}(g))\}= \langle g(x) \rangle \subset R,$$
    where \(g(x)=g_0+g_1x+\cdots +g_{n-1}x^{n-1}=\sigma ^{-1}(g)\).
     

Proof

We first prove (i). Let \(g(x)=g_0+g_1x+\cdots +g_{n-1}x^{n-1} \in R\) be a given polynomial and \(N=\langle g(x)\rangle \subset R\) the principal ideal generated by g(x) in R. By (7.182),
$$\begin{aligned} \sigma (g(x))=(g_0,g_1,\ldots ,g_{n-1})^T=T_{h}^{*}(g)\cdot \left[ \begin{array}{cccc} 1\\ 0\\ \vdots \\ 0 \end{array}\right] \in L(T_{h}^{*}(g)). \end{aligned}$$
And because
$$\begin{aligned} \begin{aligned} xg(x)&=g_{n-1}x^{n}+g_{n-2}x^{n-1}+\cdots +g_1x^2+g_0x\\&=(g_{n-2}-g_{n-1}a_{n-1})x^{n-1}+(g_{n-3}-g_{n-1}a_{n-2})x^{n-2}+\cdots \\&\quad +(g_0-g_{n-1}a_{1})x-g_{n-1}a_{0}, \end{aligned} \end{aligned}$$
so
$$\begin{aligned} \sigma (xg(x))= \left[ \begin{array}{cccc} -g_{n-1}a_{0}\\ g_0-g_{n-1}a_{1}\\ \vdots \\ g_{n-2}-g_{n-1}a_{n-1} \end{array}\right] =T_{h}\cdot g =T_{h}^{*}(g)\left[ \begin{array}{ccccc} 0\\ 1\\ 0\\ \vdots \\ 0 \end{array}\right] \in L(T_{h}^{*}(g)). \end{aligned}$$
For the same reason, for \(0 \le k \le n-1\) (the column vector below has 1 in the \((k+1)\)-th position), we have
$$\begin{aligned} \sigma (x^kg(x))=T_{h}^{k}\cdot g=T_{h}^{*}(g)\cdot \left[ \begin{array}{cccccc} 0\\ 0\\ \vdots \\ 1\\ 0\\ 0 \end{array}\right] \in L(T_{h}^{*}(g)). \end{aligned}$$
Suppose \(f(x)\in N =\langle g(x)\rangle \), then \(f(x)=b(x)\cdot g(x)\), where \(b(x)=b_0+b_1x+\cdots +b_{n-1}x^{n-1}\), then we have
$$\begin{aligned} \begin{aligned} \sigma (f(x))&=\sigma (b(x)g(x))\\&=\sum _{k=0}^{n-1}b_{k}\sigma (x^kg(x))\\&=T_{h}^{*}(g)\left[ \begin{array}{cccc} b_0\\ b_1\\ \vdots \\ b_{n-1} \end{array}\right] \in L(T_{h}^{*}(g)). \end{aligned} \end{aligned}$$
(7.185)
That proves
$$\sigma (N)=\sigma (\langle g(x)\rangle )\subset L(T_{h}^{*}(g)).$$
Conversely, for any lattice point \(\alpha \in L(T_{h}^{*}(g))\), then
$$\alpha =T_{h}^{*}(g)b=T_{h}^{*}(g)\left[ \begin{array}{cccc} b_0\\ b_1\\ \vdots \\ b_{n-1} \end{array}\right] ,$$
since \(\sigma \) is a 1-1 correspondence, by (7.185), we have
$$f(x)=\sigma ^{-1}(\alpha )=\sigma ^{-1}(T_{h}^{*}(g)b)\in N=\langle g(x)\rangle .$$
So we have
$$\sigma (N)=\sigma (\langle g(x)\rangle )=L(T_{h}^{*}(g)).$$
(i) holds. Since \(\sigma \) is a 1-1 correspondence, (ii) follows directly. This completes the proof of Theorem 7.10.
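The key identity (7.185), \(\sigma (b(x)g(x))=T_{h}^{*}(g)\,b\) in \(R=\mathbb {Z}[x]/\langle h(x)\rangle \), can be verified on a concrete instance. All values below are illustrative, and polymulmod is an ad-hoc helper that multiplies and then reduces modulo the monic h(x):

```python
import numpy as np

def rot(a):                              # rotation matrix T_h of (7.180)
    n = len(a)
    T = np.zeros((n, n), dtype=int)
    T[1:, :n - 1] = np.eye(n - 1, dtype=int)
    T[:, -1] = [-ai for ai in a]
    return T

def ideal_matrix(T, g):                  # T_h^*(g) = [g, T_h g, ..., T_h^{n-1} g]
    cols, v = [], np.array(g)
    for _ in range(len(g)):
        cols.append(v)
        v = T @ v
    return np.column_stack(cols)

def polymulmod(u, v, a):                 # multiply in Z[x]/<h>, h = x^n + a_{n-1}x^{n-1}+...+a_0
    n = len(a)
    c = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            c[i + j] += ui * vj
    for k in range(len(c) - 1, n - 1, -1):   # replace x^k by -sum_s a_s x^{k-n+s}
        t, c[k] = c[k], 0
        for s in range(n):
            c[k - n + s] -= t * a[s]
    return (c + [0] * n)[:n]

a = [2, -1, 0, 3]                        # h(x) = x^4 + 3x^3 - x + 2 (illustrative)
g = [1, 4, -2, 7]                        # generator g(x), coefficients ascending
b = [5, -3, 2, 1]                        # multiplier b(x)
lhs = polymulmod(b, g, a)                # sigma(b(x)g(x) mod h(x))
rhs = ideal_matrix(rot(a), g) @ np.array(b)
assert list(lhs) == list(rhs)            # (7.185)
```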

The above discussion of ideal matrices and ideal lattices can be extended to the finite field \(\mathbb {Z}_{q}\), because every quotient ring \(\mathbb {Z}_{q}[x]/\langle h(x)\rangle \) of the polynomial ring \(\mathbb {Z}_{q}[x]\) over a finite field is a principal ideal ring. Therefore, we can establish a 1-1 correspondence between all ideals of \(R=\mathbb {Z}_{q}[x]/\langle h(x)\rangle \) and linear codes over \(\mathbb {Z}_{q}\).

Back to the Ajtai/Dwork cryptosystem: let \(h(x)\in \mathbb {Z}_{q}[x]\) be a given monic polynomial, and select an \(n \times m\)-dimensional matrix \(A\in \mathbb {Z}_{q}^{n \times m}\) as a generalized ideal matrix, i.e.,
$$\begin{aligned} A=[A_1, A_2, \ldots , A_{\frac{m}{n}}], \end{aligned}$$
(7.186)
where \(A_{i}(1\le i \le \frac{m}{n})\) is the ideal matrix generated by \(g^{(i)}\in \mathbb {Z}_{q}^{n}\), that is
$$\begin{aligned} A_i=T_{h}^{*}(g^{(i)})=[g^{(i)}, T_{h}g^{(i)}, \ldots , T_{h}^{n-1}g^{(i)}], \end{aligned}$$
(7.187)
we get the second algorithm of Ajtai/Dwork cryptosystem:

Algorithm 2: Hash function based on ideal lattice.

Parameters: q, n, m, d are positive integers, n|m, \(m\log d>n \log q\).

Secret key: \(\frac{m}{n}\) column vectors \(g^{(i)}\in \mathbb {Z}_{q}^{n}(1\le i \le \frac{m}{n})\), polynomial \(h(x)=x^n+a_{n-1}x^{n-1}+\cdots +a_{1}x+a_{0}\in \mathbb {Z}_{q}[x]\).

Hash function \(f_{A}:\{0,1, \ldots , d-1\}^m \longrightarrow \mathbb {Z}_{q}^{n}\) defined as
$$f_{A}(y)\equiv Ay ({{\,\mathrm{mod}\,}}q),$$
The ideal matrix \(A\in \mathbb {Z}_{q}^{n \times m}\) is given by Eq. (7.186).
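Algorithm 2 differs from Algorithm 1 only in replacing the cyclic blocks by ideal-matrix blocks built from a chosen h(x). A minimal sketch with illustrative toy parameters and an illustrative h(x):

```python
import numpy as np

def rot(a):                              # rotation matrix T_h of (7.180), a = [a_0,...,a_{n-1}]
    n = len(a)
    T = np.zeros((n, n), dtype=int)
    T[1:, :n - 1] = np.eye(n - 1, dtype=int)
    T[:, -1] = [-ai for ai in a]
    return T

def ideal_matrix(T, g):                  # T_h^*(g) = [g, T_h g, ..., T_h^{n-1} g], per (7.187)
    cols, v = [], np.array(g)
    for _ in range(len(g)):
        cols.append(v)
        v = T @ v
    return np.column_stack(cols)

# toy parameters with n | m and m log d > n log q  (20 log 2 > 4 log 16)
n, q, d, m = 4, 16, 2, 20
a = [3, 1, 0, 2]                         # h(x) = x^4 + 2x^3 + x + 3 over Z_16 (illustrative)
rng = np.random.default_rng(2)
gs = [rng.integers(0, q, size=n) for _ in range(m // n)]       # secret key
T = rot(a)
A = np.hstack([ideal_matrix(T, g) for g in gs]) % q            # n x m generalized ideal matrix

def f_A(y):                              # hash {0,...,d-1}^m -> Z_q^n
    return A @ y % q

y = rng.integers(0, d, size=m)
print(f_A(y))
```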

We will not discuss here the collision resistance of the hash functions constructed from cyclic lattices and ideal lattices; interested readers may consult the reference Micciancio and Regev (2009) in this chapter.

Exercise 7
  1. 1.

    Let \(L \subset \mathbb {R}^n\) be a (full-rank) lattice and \(L^{*}\) the dual lattice of L. Show that the integer lattice \(\mathbb {Z}^{n}\) is self-dual, that is, \((\mathbb {Z}^{n})^{*}=\mathbb {Z}^{n}\). For \(L=2\mathbb {Z}^{n}\), find \(L^{*}\).

     
  2. 2.

    Is it true that L is self-dual if and only if \(L=\mathbb {Z}^{n}\)? Why?

     
  3. 3.
    Under the assumptions of Exercise 1, let \(\lambda _1(L)\) be the length of a shortest vector of L and \(\lambda _1(L^{*})\) that of the dual lattice \(L^{*}\). Prove that
    $$\lambda _1(L) \cdot \lambda _1(L^{*}) \le n.$$
     
  4. 4.
    Let \(\lambda _1(L), \lambda _2(L), \ldots , \lambda _n(L)\) be the successive minima of the lattice L; prove
    $$\lambda _1(L) \cdot \lambda _n(L^{*}) \ge 1.$$
     
  5. 5*.
    Let L be a lattice, \(B=[\beta _1, \beta _2, \ldots , \beta _n]\) a generating matrix of L, and \(B^{*}=[\beta _1^{*}, \beta _2^{*}, \ldots , \beta _n^{*}]\) the corresponding Gram–Schmidt orthogonalization. Prove: every lattice L has a basis \(\{\beta _1, \beta _2, \ldots , \beta _n\}\) such that
    $$\frac{1}{n}\lambda _1(L) \le \min \{|\beta _1^{*}|, |\beta _2^{*}|, \ldots , |\beta _n^{*}|\}\le \lambda _1(L).$$
    (Hint: use KZ basis on lattice L).
     
  6. 6.
    Under the assumptions of Exercise 5, let \(\lambda _1(L), \lambda _2(L), \ldots , \lambda _n(L)\) be the successive minima of the lattice L; prove:
    $$\lambda _j(L) \ge \min _{j \le i \le n}|\beta _i^{*}|, ~1 \le j \le n.$$
     
  7. 7.
    For a full-rank lattice \(L \subset \mathbb {R}^n\), define its covering radius \(\mu (L)\) as
    $$\mu (L)=\max _{x \in \mathbb {R}^n}|x-L|.$$
    Prove: the covering radius of any lattice L exists.
     
  8. 8.

    Prove: \(\mu (\mathbb {Z}^{n})=\frac{1}{2}\sqrt{n}.\)

     
  9. 9.

    For any lattice \(L \subset \mathbb {R}^n\), prove: \(\mu (L) \ge \frac{1}{2}\lambda _n(L)\).

     
  10. 10.
    For any lattice \(L \subset \mathbb {R}^n\), prove the following theorem:
    $$\lambda _1(L)\cdot \mu (L^{*})\le n.$$
     

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://​creativecommons.​org/​licenses/​by/​4.​0/​), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Literature
  1. Ajtai, M. (2004). Generating hard instances of lattice problems. In Quad. Mat.: Vol. 13. Complexity of computations and proofs (pp. 1–32). Dept. Math., Seconda Univ. Napoli. Preliminary version in STOC 1996.
  2. Ajtai, M., & Dwork, C. (1997). A public-key cryptosystem with worst-case/average-case equivalence. In Proceedings of 29th Annual ACM Symposium on Theory of Computing (STOC) (pp. 284–293).
  3. Babai, L. (1986). On Lovász lattice reduction and the nearest lattice point problem. Combinatorica, 6, 1–13.
  4. Cassels, J. W. S. (1963). Introduction to diophantine approximation. Cambridge University Press.
  5. Cassels, J. W. S. (1971). An introduction to the geometry of numbers. Springer.
  6. Gama, N., & Nguyen, P. Q. (2008a). Finding short lattice vectors within Mordell’s inequality. In Proceedings of 40th ACM Symposium on Theory of Computing (STOC) (pp. 207–216).
  7. Gama, N., & Nguyen, P. Q. (2008b). Predicting lattice reduction. In Lecture Notes in Computer Science: Advances in cryptology. Proceedings of Eurocrypt’08. Springer
  8. Goldreich, O., Goldwasser, S., & Halevi, S. (1997). Public-key cryptosystems from lattice reduction problems. In Lecture Notes in Computer Science: Vol. 1294. Advances in cryptology (pp. 112–131). Springer.
  9. Hoffstein, J., Pipher, J., & Silverman, J. H. (1998). NTRU: A ring based public key cryptosystem. In LNCS: Vol. 1423. Proceedings of ANTS-III (pp. 267–288). Springer.
  10. Klein, P. (2000). Finding the closest lattice vector when it’s unusually close. In Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 937–941).
  11. Lenstra, A. K., Lenstra, H. W., Jr., & Lovász, L. (1982). Factoring polynomials with rational coefficients. Mathematische Annalen, 261(4), 515–534.
  12. McEliece, R. (1978). A public-key cryptosystem based on algebraic coding theory. Technical Report, Jet Propulsion Laboratory. DSN Progress Report 42-44.
  13. Micciancio, D. (2001). Improving lattice based cryptosystems using the Hermite normal form. In J. Silverman (Ed.), Lecture Notes in Computer Science: Vol. 2146. Cryptography and lattices conference—CaLC 2001 (pp. 126–145). Springer.
  14. Micciancio, D., & Regev, O. (2009). Lattice-based cryptography. Springer.
  15. Niederreiter, H. (1986). Knapsack-type cryptosystems and algebraic coding theory. Problems of Control and Information Theory/Problemy Upravlen. Teor. Inform., 15(2), 159–166.
  16. Peikert, C. (2016). A decade of lattice cryptography. Foundations & Trends in Theoretical Computer Science.
  17. Regev, O. (2004). Lattices in computer science (Lecture 1–Lecture 7). Tel Aviv University, Fall.