1 Introduction
1.1 Related work and motivation
1.2 Contributions
-
We propose a new integer representation to optimize the implementation of modular multiplication using the characteristic of modulo prime which has the term “− 1.” In this regard, we choose the 192-bit NIST standard prime, which has such characteristic and suitable for constrained devices.
-
On the basis of the new integer representation, we present a novel approach for the 192-bit modular multiplication over the 192-bit NIST prime for 8-bit architectures. By merging the reduction operations into the subtractive Karatsuba multiplication on the new integer representation, we optimize the intermediate accumulation in the modular multiplication. Our merged modular multiplication has two duplicated groups of 96-bit intermediate results during accumulation. Hence, only one accumulation of the group is required and the result can be used twice. Consequently, we significantly reduce the number of load/store instructions as well as that of addition instructions.
-
We present the implementation result of our proposed 192-bit modulo multiplication over the 192-bit NIST prime on an 8-bit AVR ATmega microcontrollers. The result of our work takes only 2888 clock cycles, which is 17% faster than the previous best record of modular multiplication by Liu et al. [3]. In addition, our modular multiplication is even faster than Hutter’s subtractive Karatsuba multiplication (without reduction) [13] which achieved a speed record for multiplication on AVR processor.
2 Preliminaries
2.1 Elliptic curve cryptography
2.2 Multi-precision multiplication techniques
2.2.1 Operand scanning method
2.2.2 Product scanning method
2.2.3 Hybrid scanning method
2.2.4 Operand caching method
2.2.5 Subtractive Karatsuba method
3 Proposed modular multiplication
3.1 Range shifted representation
3.2 Modular multiplication with RSR
3.3 Implementation of modular multiplication with RSR
3.3.1 96-Bit 1-level Karatsuba multiplication
3.3.2 Modified 96-bit 1-level Karatsuba multiplication for \(L^{(2)}\)
3.3.3 192-Bit 2-level Karatsuba multiplication with reduction
4 Result
Approach | Including reduction | Cycle counts |
---|---|---|
Operand scanning\(^\mathrm{a}\) | X | 7760 |
Product scanning\(^\mathrm{a}\) | X | 5614 |
Hybrid scanning\(^\mathrm{a}\) [1] | X | 4133 |
Operand caching\(^\mathrm{a}\) [10] | X | 3470 |
Consecutive operand caching\(^\mathrm{a}\) [11] | X | 3437 |
Subtractive Karatsuba\(^\mathrm{a}\) [13] | X | 2987 |
Consecutive operand caching\(^\mathrm{b}\) [3] | O | 4042 |
Subtractive Karatsuba\(^\mathrm{b}\) [3] | O | 3597 |
This paper\(^\mathrm{b}\) | O | 2958 |