Multivariate signature belongs to Multivariate-Quadratic-Equations Public Key Cryptography (MPKC), which is secure to quantum computer attacks. Compared with RSA and ECC, it is required to speed up multivariate signature implementations. A high-speed hardware architecture for signature generations of a multivariate scheme is proposed in this paper. The main computations of signature generations of multivariate schemes are additions, multiplications, inversions, and solving systems of linear equations (LSEs) in a finite field. Thus, we improve the finite field multiplications via using composite field expression and design a finite field inversion via using binary trees. Besides, we improve solving LSEs in a finite field based on a variant algorithm of Gauss-Jordan elimination and use the XOR gates to compute additions. We implement the high-speed hardware architecture based on the above improvements on an Altera Stratix Field-Programmable Gate Array (FPGA), which shows that it takes only 90 clock cycles and 0.9 μs to generate a multivariate signature. The comparison shows that the hardware architecture is much faster than other implementations.
Abbreviations
ECC
Elliptic Curve Crypto
enTTS
Enhanced Tame Transformation Signature
FPGA
Field-Programmable Gate Array
LSE
Systems of Linear Equations
MPKC
Multivariate Quadratic Equations Public Key Cryptography
MQ
Multivariate Quadratic
PKC
Public Key Cryptography
RSA
Rivest-Shamir-Adleman
TTS
Tame Transformation Signature
UOV
Unbalanced Oil-Vinegar
1 Introduction
Quantum technology has developed rapidly in recent years. Quantum computer is in a position to attack RSA [1], ECC [2], and other signature algorithms adopted by many chips due to the algorithm by Peter Shor [3]. Therefore, chip security is facing severe threats.
Fortunately, there are a few post-quantum candidates for signature chips, in which multivariate signature is included [4]. Multivariate signature belongs to Multivariate Quadratic Equations Public Key Cryptography (MPKC), which is secure to quantum computer attacks and general computer attacks [5, 6]. MPKC is first proposed by Matsumoto and Imai in the 1980s. During the past 30 years, various schemes of MPKC have been proposed [7‐32], which includes Rainbow [28], Unbalanced Oil-Vinegar (UOV) [29], and Tame Transformation Signature (TTS) [30, 31]. Software and hardware implementations of multivariate signature schemes have been one of the topics of many researchers [33‐40]. enTTS belongs to the triangular family, which can be viewed as extensions of Tame Transformation Method (TTM) Among the existing enTTS schemes, enTTS(20,28) is believed to be one of the fastest signature schemes, which works with 20 B hashes and 28 B signatures. Compared with the implementations of other public key cryptosystems, e.g., RSA and ECC, we need to speed up multivariate signature generations.
Advertisement
Some previous works of efficient implementations of multivariate signature schemes are as follows:
The work in [33] proposed a fast implementation of SFlash;
The work in [34] presented an efficient public key generation for multivariate cryptosystems;
In [35], a minimized PKC of multivariate schemes on low-resource embedded systems was proposed;
Advertisement
In [36], an efficient implementation of multivariate quadratic systems was presented;
A fast implementation of Rainbow signature generation was proposed in [37];
A high-speed hardware architecture based on Rainbow signature on Field-Programmable Gate Arrays (FPGAs) was proposed in [38];
For low area design, a multivariate signature FPGA processor was proposed in [39];
The work of [40] was a time-area optimized design, which showed that multivariate cryptosystems are more efficient than ECC.
Among such multivariate signature schemes, enTTS is a Tame-like multivariate public key cryptosystem [32]. Hardware implementations of TTS (including enTTS) signature are mainly proposed in [31, 35, 36, 39, 40]. Most of these implementations are focusing on area optimizations. The main computations during generations of enTTS signature are addition, multiplication, inversion, and solving Systems of Linear Equations (LSEs) in a finite field.
Thus, the main contributions of this paper are as follows. We improve the finite field multiplications via using composite field expression and design a finite field inversion based on binary trees described in [41]. Besides, we improve solving LSEs in a finite field based on the algorithm of Gauss-Jordan elimination and use the XOR gates to compute additions.
We implement the high-speed hardware architecture based on the above improvements on an Altera Stratix FPGA. The comparison shows that the hardware architecture is much faster than other implementations of public key cryptosystems.
We organize the rest of this paper as follows: the algorithm of multivariate scheme is introduced in Section 2; the high-speed hardware architecture for multivariate signature generations is given in Section 3; implementation results of the high-speed hardware architecture on FPGAs and comparisons with related cryptosystems are given in Section 4; Conclusions are summarized in Section 5.
2 Method
This study originates from a need to speed up signature generations of multivariate scheme, since the efficiency of implementations should be improved as a quantum-resistance cryptosystems. Specifically, we propose a high-speed hardware architecture for multivariate scheme through improving finite field multiplications based on composite field expression, finite field inversions based on binary trees and solving LSEs based on the algorithm of Gauss-Jordan elimination.
We implement the high-speed hardware architecture based on the above improvements on an Altera Stratix FPGA and the comparison shows that the hardware architecture is much faster than other implementations.
The multivariate scheme, enTTS is employed to the architecture for hardware implementations of signature generations in a finite field. enTTS belongs to the triangular family, which can be viewed as extensions of Tame Transformation Method (TTM). enTTS is designed with a higher security level than TTS. We illustrate enTTS parameters in Table 1.
We use GF((24)2) for implementation of enTTS, which is a composite field of GF(256). We suppose that y denotes the message (20 B) of multivariate scheme and y0, y1, …, y19 denote each byte from the message, where y0, y1, …, y19 are elements in GF((24)2). We suppose that x denotes the signature (28 B) and x0, x1, …, x27 denote each byte of the signature, where x0, x1, …, x27 are elements in GF((24)2).
The construction of this signature scheme uses affine transformation L1, central map transformation F, and affine transformation L2.
In order to sign a message, i.e., y(y0, y1, …, y19), it is required to compute several steps for the following equation:
\( {L}_1^{-1} \) is an invertible affine transformation with the following form:
$$ \overline{y}= Ay+B. $$
A is a matrix with the size of 20 × 20, part of private keys of enTTS;
B is a vector with the size of 20, part of private keys of enTTS.
Then, \( \overline{y}\left({\overline{y}}_0,{\overline{y}}_1,\dots, {\overline{y}}_{19}\right) \) is the result of affine transformation L1, where \( {\overline{y}}_0,{\overline{y}}_1,\dots, {\overline{y}}_{19} \) are elements in GF((24)2).
Second, the following equation is required to solve:
We suppose that \( x\left({\overline{x}}_0,{\overline{x}}_1,\dots, {\overline{x}}_{27}\right) \) denote the result of central map transformation F, where \( {\overline{x}}_0,{\overline{x}}_1,\dots, {\overline{x}}_{27} \) are elements in GF((24)2). MQ polynomials f(f0, f1, …, f19) are defined by the following equations:
The first group variables of \( \overline{x} \), i.e., \( {\overline{x}}_0,{\overline{x}}_1,\dots, {\overline{x}}_7 \) are randomly chosen and then the first group polynomials of fi, i.e., f0, f1, …, f8 are evaluated.
Then, the second group variables of \( \overline{x} \) are \( {\overline{x}}_8,{\overline{x}}_9,\dots, {\overline{x}}_{16} \), and we solve the LSEs on such variables.
Next, we evaluate the second group polynomials of fi, i.e., f9, f10 and solve the third group variables of \( \overline{x} \), i.e., \( {\overline{x}}_{17},{\overline{x}}_{18} \).
Then, the third group polynomials of fi, i.e., f11, f12, …, f19 are evaluated and we solve the LSEs on such variables of the fourth group variables of \( \overline{x} \), i.e., \( {\overline{x}}_{19},{\overline{x}}_{20},\dots, {\overline{x}}_{27} \).
After that, the result of central map transformation F, i.e., \( x\left({\overline{x}}_0,{\overline{x}}_1,\dots, {\overline{x}}_{27}\right) \) is computed.
Last, we solve the following equations based on the values of \( {\overline{x}}_{19},{\overline{x}}_{20},\dots, {\overline{x}}_{27} \).
\( {L}_2^{-1} \) is an invertible affine transformation
$$ x=C\overline{x}+D. $$
C is a matrix with the size of 28 × 28, part of private keys of enTTS;
D is a vector with the size of 28, part of private keys of enTTS.
Finally, we have computed the signature of y(y0, y1, …, y19), which is x(x0, x1, …, x27).
3 A high-speed hardware architecture for multivariate signature
3.1 Overview of the hardware architecture
We choose enTTS (20,28) scheme described in Section 2 for hardware implementations in a composite field GF((24)2), where the size of message (hash value) is 20 B and the signature size is 28 B.
We illustrate the generation of a multivariate signature in Fig. 1. It can be observed from Fig. 1 that the signature generations of multivariate scheme include seven steps:
×
(1) Affine transformation L1.
\( {L}_1^{-1} \) is an invertible affine transformation with the following form.
$$ \overline{y}= Ay+B. $$
A is a matrix with the size of 20 × 20.
B is a vector with the size of 20.
It can be observed that \( {L}_1^{-1} \) is performed via matrix-vector multiplications and vector additions, where A and B are parts of private keys.
(2) Polynomial evaluation (first part F)
First, we randomly choose the variables of \( {\overline{x}}_0,{\overline{x}}_1,\dots, {\overline{x}}_7 \), i.e., the first group variables of \( \overline{x} \).
Second, we evaluate the polynomials of f0, f1, …, f8, i.e., the first group polynomials of fi.
After that, this part of polynomial evaluation is performed via using additions and multiplications in a finite field.
(3) Solving (LSEs) in a finite field
During the signature generations of multivariate scheme, it is required to perform solving LSEs twice with the same matrix of size 9 × 9.
First, for the second group variables of \( \overline{x} \), i.e., \( {\overline{x}}_8,{\overline{x}}_9,\dots, {\overline{x}}_{16} \), we solve the LSEs on such variables.
Second, for the fourth group variables of \( \overline{x} \), i.e., \( {\overline{x}}_{19},{\overline{x}}_{20},\dots, {\overline{x}}_{27} \), we solve the LSEs on such variables.
During this step, solving LSEs is performed via using a variant Gauss-Jordan elimination in a finite field.
(4) Polynomial evaluation (second part F)
The third group variables of \( \overline{x} \), i.e., \( {\overline{x}}_{17},{\overline{x}}_{18} \) are solved by evaluating the second group polynomials of fi, i.e., f9, f10.
This part of polynomial evaluation is performed via using additions and multiplications in a finite field;
(5) Polynomial evaluation (third part F)
We evaluate the third group polynomials of fi, i.e., f11, f12, …, f19.
This part of polynomial evaluation is performed via using additions and multiplications in a finite field.
(6) Affine transformation L2: \( {L}_2^{-1} \) is an affine transformation with the following form:
$$ x=C\overline{x}+D. $$
C: is a matrix with the size of 28 × 28.
D is a vector with the size of 28.
It can be observed that \( {L}_2^{-1} \) is performed via matrix-vector multiplications and vector additions, where C and D are parts of private keys.
Our hardware architecture for the signature generation of multivariate scheme is depicted in Fig. 2. It can be observed from Fig. 2 that the hardware architecture consists of adders, multipliers, inverter, parallel Gauss-Jordan eliminator, polynomial evaluation, matrix vector multiplication, vector addition, polynomial evaluation, and processor components in a finite field, where only the first four components are computing components and the others are logical components.
×
3.2 Performance evaluation of irreducible polynomial in composite fields
Irreducible polynomials in composite fields are involved in the additions, multiplications, and other operations during signature generations. Thus, the performance evaluation of the irreducible polynomial in the composite field GF((24)2) is very critical for the implementation of high-speed hardware architecture of multivariate scheme.
We suppose that q(x) denotes the irreducible polynomial in GF((24)2), and it has the following form.
$$ q(x)={x}^2+{q}_1x+{q}_0. $$
p1, p0 are elements in GF(24).
We suppose that p(x) denotes the irreducible polynomial in the subfield of GF((24)2), i.e., GF(24), and it has the following form:
$$ p(x)={x}^4+{p}_3{x}^3+{p}_2{x}^2+{p}_1x+1. $$
p3, p2, p1 are bits, i.e., 0 or 1.
The performance of the multiplications and inversions has been evaluated based on such irreducible polynomials, respectively. q(x) = x2 + x + 9 is chosen as the irreducible polynomials in GF((24)2) and p(x) = x4 + x + 1 is chosen as the irreducible polynomials in the subfield GF(24).
3.3 Finite field adder
Let a(x) = ahx + al and b(x) = bhx + bl be the elements in GF((24)2), where ah, al, bh, and bl are elements in GF(24).
Then the addition of a(x) and b(x) can be expressed as
We perform the polynomial multiplication and reduction module the irreducible polynomial q(x) = x2 + x + 9. We suppose that ch and cl are elements in GF(24), and we can compute their values via the following expressions:
It can be observed that the critical path of multiplication of two elements in GF((24)2) includes one multiplication, one constant multiplication, and one addition in GF(24).
p(x) is the irreducible polynomial in GF(24). Let \( a(x)=\sum \limits_{i=0}^3{a}_i{x}^i \) and \( b(x)=\sum \limits_{i=0}^3{b}_i{x}^i \) be elements in GF(24), ai, bi ∈ GF(2), and we suppose that
We use two binary trees for inversions in subfield GF(24), which are illustrated as follows:
Each binary tree has four layers in GF(24);
Root nodes are on the third layer;
Each node has at most two child nodes, left node represents value of zero and right node represents value of one;
Each child must either be a leaf or the root of another tree, each node has a father node when it is not a root node;
Each element in a finite field (except (0000)2 and (0001)2) has a unique traversal from root to leaf due to the fact that (0000)2 has no inverse and the inverse of (0001)2 is itself;
Each leaf is linked to another leaf.
Figure 3 depicts the architecture for inversions in GF(24).
×
Example 1. It can be observed from Fig. 4, if it is required to inverse the element (0100)2, we search the binary tree from root nodes to leaf nodes, the path from n1 to n4 represents (0100)2. n4 is linked with n8, thus the path from n5 to n8 represents the inverse of (0100)2, i.e., (1001)2.
×
3.6 Parallel Gauss-Jordan eliminator
During central map transformation in signature generations, it is required to solve LSEs in a finite field twice with the same matrix size 9 × 9.
We adopt a parallel Gauss-Jordan elimination, which is depicted in Fig. 5. It can solve a LSE with matrix size of 9 × 9. The parallel Gauss-Jordan eliminator solves systems of linear equations with 9 iterations, which is enhanced in the following directions:
×
First, exclusive adders are used in the parallel Gauss-Jordan elimination based on the design described in Section 3.3;
Second, exclusive multipliers are used in the parallel Gauss-Jordan elimination based on the design described in Section 3.4;
Third, exclusive inverters are used in the parallel Gauss-Jordan elimination based on the design described in Section 3.5.
It can be observed from Fig. 4, I, Nl, and Ekl are three kinds of cells in the architecture, where k = 1, 2, …, 9 and l = 1, 2, …, 10.
The I cell is used for multiplicative inversion in a finite field, which includes an exclusive inverter described in Section 3.5.
The Nl cells are used for normalization of finite field elements, which includes exclusive multipliers described in Section 3.4.
The Ekl cells are used for elimination of finite field elements, which includes exclusive adders and multipliers described in Sections 3.3 and 3.4.
In conclusion, the architecture includes one I cell, 9 Nl cells, and 90 Elk cells and solves the LSEs within 9 clock cycles with the matrix size of 9 × 9.
4 Results
In this section, we investigate the performance of the high-speed hardware architecture for multivariate scheme through hardware implementations on an Altera Stratix FPGA. The implementation is programmed in the hardware programming language, Verilog.
The implementation of multivariate signature generations is illustrated in Table 2, where the executing time for a signature generation of enTTS is 0.9 μs, the time frequency is 100 MHz, and the clock cycle is 90. It should be noted that, all of the results from implementations mentioned are extracted after place and route on the Altera Stratix FPGA.
Table 2
FPGA implementation of hardware architecture for multivariate signature generation
Signature scheme
Message size
Signature size
L1 matrix
L2 matrix
Systems of linear equations
Clock cycle
Time frequency
Executing time
enTTS(20,28)
20 B
28 B
20 × 20
28 × 28
9 × 9
90
100 MHz
0.9 μs
We compare the high-speed hardware architecture with the related implementations of multivariate schemes and other public key schemes, which is depicted in Table 3. The ECC cryptosystems proposed in [2] are efficient implementations and Rainbow cryptosystems proposed in [38] are fast implementations.
It can be observed from Table 3 that the high-speed hardware architecture is much faster than the related implementations of public key cryptosystems.
5 Conclusions
We propose a high-speed cryptographic architecture for hardware implementation of multivariate signature generations in this paper. The main computations of signature generations of multivariate scheme are multiplications, inversions, and solving LSEs in a finite field. Thus, we improve the finite field multiplications via using composite field expression and design a finite field inversion via using binary trees. Besides, we improve solving LSEs in a finite field based on the variant algorithm of Gauss-Jordan elimination.
We implement the high-speed hardware architecture based on the above improvements on an Altera Stratix FPGA device. The implementation results show that the executing time for a signature generation of multivariate scheme is 0.9 μs, the time frequency is 100 MHz, and the clock cycle is 90. The comparison shows that the hardware architecture is much faster than other implementations.
Acknowledgements
The authors would like to express our cordial thanks to the section editor in charge and the respected reviewers for their time, accurate review of our manuscript, and their invaluable comments on this paper.
Funding
This work is supported by Shenzhen Science and Technology Program under Grant (nos. JCYJ20170306144219159 and JCYJ20160428092427867), Foundation for Distinguished Young Talents in Higher Education of Guangdong, China (no. 2017GkQNCX059), Special Fund for the Development of Strategic Emerging Industries and Future Industries of Shenzhen (no. 20170502142224600), Science and Technology Program of Shenzhen Polytechnic (no. 601722 K20018).
Authors’ information
Haibo Yi received the bachelor degree in computer science from Beijing Jiaotong University, China, in 2009 and the PhD from South China University of Technology, China in 2015. Since 2015, he has been with School of Computer Engineering of Shenzhen Polytechnic, as a lecturer. He has published over 20 technical papers. His main research areas are information security, cloud computing, and big data. He is a member of Chinese Association for Cryptologic Research.
Zhe Nie received the B.S. degree in industrial automation from the North China University of Technology, China and the M.S. degree in Computer application technology from Harbin Institute of Technology, China. He joined Shenzhen Polytechnic in 1994 and currently is a professor. His research interests include internet public opinion, artificial intelligence, and computer vision technology.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.