Skip to main content

2014 | OriginalPaper | Buchkapitel

Montgomery Multiplication Using Vector Instructions

verfasst von : Joppe W. Bos, Peter L. Montgomery, Daniel Shumow, Gregory M. Zaverucha

Erschienen in: Selected Areas in Cryptography -- SAC 2013

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

In this paper we present a parallel approach to compute interleaved Montgomery multiplication. This approach is particularly suitable to be computed on 2-way single instruction, multiple data platforms as can be found on most modern computer architectures in the form of vector instruction set extensions. We have implemented this approach for tablet devices which run the x86 architecture (Intel Atom Z2760) using SSE2 instructions as well as devices which run on the ARM platform (Qualcomm MSM8960, NVIDIA Tegra 3 and 4) using NEON instructions. When instantiating modular exponentiation with this parallel version of Montgomery multiplication we observed a performance increase of more than a factor of 1.5 compared to the sequential implementation in OpenSSL for the classical arithmetic logic unit on the Atom platform for 2048-bit moduli.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
These were the newest versions available for each architecture at the time of writing.
 
2
Results of the NEON vmull/vmlal instructions are available after 7 cycles, while the two 32-bit outputs of the ARM umaal instruction become ready after 4 and 5 cycles [1, 2].
 
Literatur
1.
Zurück zum Zitat ARM. Cortex-A9. Technical Reference Manual (2010). Version r2p2 ARM. Cortex-A9. Technical Reference Manual (2010). Version r2p2
2.
Zurück zum Zitat ARM. Cortex-A9 NEON Media Processing Engine. Technical Reference Manual (2012). Version r4p1 ARM. Cortex-A9 NEON Media Processing Engine. Technical Reference Manual (2012). Version r4p1
3.
Zurück zum Zitat ARM Limited. ARM Architechture Reference Manual ARMv7-A and ARMv7-R edition (2010) ARM Limited. ARM Architechture Reference Manual ARMv7-A and ARMv7-R edition (2010)
4.
Zurück zum Zitat Bajard, J.-C., Didier, L.-S., Kornerup, P.: An RNS Montgomery modular multiplication algorithm. IEEE Trans. Comput. 47(7), 766–776 (1998)CrossRefMathSciNet Bajard, J.-C., Didier, L.-S., Kornerup, P.: An RNS Montgomery modular multiplication algorithm. IEEE Trans. Comput. 47(7), 766–776 (1998)CrossRefMathSciNet
6.
Zurück zum Zitat Bernstein, D.J., Schwabe, P.: NEON crypto. In: Prouff, E., Schaumont, P. (eds.) CHES 2012. LNCS, vol. 7428, pp. 320–339. Springer, Heidelberg (2012) Bernstein, D.J., Schwabe, P.: NEON crypto. In: Prouff, E., Schaumont, P. (eds.) CHES 2012. LNCS, vol. 7428, pp. 320–339. Springer, Heidelberg (2012)
7.
Zurück zum Zitat Bos, J.W.: High-performance modular multiplication on the cell processor. In: Hasan, M.A., Helleseth, T. (eds.) WAIFI 2010. LNCS, vol. 6087, pp. 7–24. Springer, Heidelberg (2010) Bos, J.W.: High-performance modular multiplication on the cell processor. In: Hasan, M.A., Helleseth, T. (eds.) WAIFI 2010. LNCS, vol. 6087, pp. 7–24. Springer, Heidelberg (2010)
8.
Zurück zum Zitat Bos, J.W., Kaihara, M.E.: Montgomery multiplication on the cell. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds.) PPAM 2009, Part I. LNCS, vol. 6067, pp. 477–485. Springer, Heidelberg (2010) Bos, J.W., Kaihara, M.E.: Montgomery multiplication on the cell. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds.) PPAM 2009, Part I. LNCS, vol. 6067, pp. 477–485. Springer, Heidelberg (2010)
9.
Zurück zum Zitat Comba, P.G.: Exponentiation cryptosystems on the IBM PC. IBM Syst. J. 29(4), 526–538 (1990)CrossRef Comba, P.G.: Exponentiation cryptosystems on the IBM PC. IBM Syst. J. 29(4), 526–538 (1990)CrossRef
10.
Zurück zum Zitat Dixon, B., Lenstra, A.K.: Massively parallel elliptic curve factoring. In: Rueppel, R.A. (ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 183–193. Springer, Heidelberg (1993) Dixon, B., Lenstra, A.K.: Massively parallel elliptic curve factoring. In: Rueppel, R.A. (ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 183–193. Springer, Heidelberg (1993)
11.
Zurück zum Zitat ElGamal, T.: A public key cryptosystem and a signature scheme based on discrete logarithms. In: Blakley, G., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 10–18. Springer, Heidelberg (1985) ElGamal, T.: A public key cryptosystem and a signature scheme based on discrete logarithms. In: Blakley, G., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 10–18. Springer, Heidelberg (1985)
12.
Zurück zum Zitat Faz-Hernandez, A., Longa, P., Sanchez, A.H.: Efficient and secure algorithms for GLV-based scalar multiplication and their implementation on GLV-GLS curves. Cryptology ePrint Archive, Report 2013/158 (2013). http://eprint.iacr.org/. CT\_RSA. doi:10.1007/978-3-319-04852-9_1 Faz-Hernandez, A., Longa, P., Sanchez, A.H.: Efficient and secure algorithms for GLV-based scalar multiplication and their implementation on GLV-GLS curves. Cryptology ePrint Archive, Report 2013/158 (2013). http://​eprint.​iacr.​org/​. CT\_RSA. doi:10.1007/978-3-319-04852-9_1
14.
Zurück zum Zitat Garner, H.L.: The residue number system. IRE Trans. Electron. Comput. 8, 140–147 (1959)CrossRef Garner, H.L.: The residue number system. IRE Trans. Electron. Comput. 8, 140–147 (1959)CrossRef
15.
Zurück zum Zitat Grabher, P., Großschädl, J., Page, D.: On software parallel implementation of cryptographic pairings. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381, pp. 35–50. Springer, Heidelberg (2009) Grabher, P., Großschädl, J., Page, D.: On software parallel implementation of cryptographic pairings. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381, pp. 35–50. Springer, Heidelberg (2009)
16.
Zurück zum Zitat Großschädl, J.: Architectural support for long integer modulo arithmetic on RISC-based smart cards. Int. J. High Perform. Comput. Appl. - IJHPCA 17(2), 135–146 (2003)CrossRef Großschädl, J.: Architectural support for long integer modulo arithmetic on RISC-based smart cards. Int. J. High Perform. Comput. Appl. - IJHPCA 17(2), 135–146 (2003)CrossRef
17.
Zurück zum Zitat Großschädl, J., Avanzi, R.M., Savaş, E., Tillich, S.: Energy-efficient software implementation of long integer modular arithmetic. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 75–90. Springer, Heidelberg (2005) Großschädl, J., Avanzi, R.M., Savaş, E., Tillich, S.: Energy-efficient software implementation of long integer modular arithmetic. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 75–90. Springer, Heidelberg (2005)
18.
Zurück zum Zitat Gueron, S., Krasnov, V.: Software implementation of modular exponentiation, using advanced vector instructions architectures. In: Özbudak, F., Rodríguez-Henríquez, F. (eds.) WAIFI 2012. LNCS, vol. 7369, pp. 119–135. Springer, Heidelberg (2012) Gueron, S., Krasnov, V.: Software implementation of modular exponentiation, using advanced vector instructions architectures. In: Özbudak, F., Rodríguez-Henríquez, F. (eds.) WAIFI 2012. LNCS, vol. 7369, pp. 119–135. Springer, Heidelberg (2012)
19.
Zurück zum Zitat Harrison, O., Waldron, J.: Efficient acceleration of asymmetric cryptography on graphics hardware. In: Preneel, B. (ed.) AFRICACRYPT 2009. LNCS, vol. 5580, pp. 350–367. Springer, Heidelberg (2009) CrossRef Harrison, O., Waldron, J.: Efficient acceleration of asymmetric cryptography on graphics hardware. In: Preneel, B. (ed.) AFRICACRYPT 2009. LNCS, vol. 5580, pp. 350–367. Springer, Heidelberg (2009) CrossRef
20.
Zurück zum Zitat Holz, R., Braun, L., Kammenhuber, N., Carle, G.: The SSL landscape: a thorough analysis of the x.509 PKI using active and passive measurements. In: Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, IMC ’11, pp. 427–444. ACM (2011) Holz, R., Braun, L., Kammenhuber, N., Carle, G.: The SSL landscape: a thorough analysis of the x.509 PKI using active and passive measurements. In: Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, IMC ’11, pp. 427–444. ACM (2011)
23.
Zurück zum Zitat Iwamura, K., Matsumoto, T., Imai, H.: Systolic-arrays for modular exponentiation using montgomery method. In: Rueppel, R.A. (ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 477–481. Springer, Heidelberg (1993) Iwamura, K., Matsumoto, T., Imai, H.: Systolic-arrays for modular exponentiation using montgomery method. In: Rueppel, R.A. (ed.) EUROCRYPT 1992. LNCS, vol. 658, pp. 477–481. Springer, Heidelberg (1993)
24.
Zurück zum Zitat Kaihara, M.E., Takagi, N.: Bipartite modular multiplication method. IEEE Trans. Comput. 57(2), 157–164 (2008)CrossRefMathSciNet Kaihara, M.E., Takagi, N.: Bipartite modular multiplication method. IEEE Trans. Comput. 57(2), 157–164 (2008)CrossRefMathSciNet
25.
Zurück zum Zitat Karatsuba, A.A., Ofman, Y.: Multiplication of many-digital numbers by automatic computers. Proc. USSR Acad. Sci. 145, 293–294 (1962) Karatsuba, A.A., Ofman, Y.: Multiplication of many-digital numbers by automatic computers. Proc. USSR Acad. Sci. 145, 293–294 (1962)
26.
Zurück zum Zitat Koc, K., Acar, T., Kaliski Jr, B.S.: Analyzing and comparing montgomery multiplication algorithms. IEEE Micro 16(3), 26–33 (1996)CrossRef Koc, K., Acar, T., Kaliski Jr, B.S.: Analyzing and comparing montgomery multiplication algorithms. IEEE Micro 16(3), 26–33 (1996)CrossRef
27.
Zurück zum Zitat Kocher, P.C.: Timing attacks on implementations of Diffie-Hellman, RSA, DSS and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 104–113. Springer, Heidelberg (1996) Kocher, P.C.: Timing attacks on implementations of Diffie-Hellman, RSA, DSS and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 104–113. Springer, Heidelberg (1996)
28.
Zurück zum Zitat Lenstra, A.K., Hughes, J.P., Augier, M., Bos, J.W., Kleinjung, T., Wachter, C.: Public keys. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 626–642. Springer, Heidelberg (2012) CrossRef Lenstra, A.K., Hughes, J.P., Augier, M., Bos, J.W., Kleinjung, T., Wachter, C.: Public keys. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO 2012. LNCS, vol. 7417, pp. 626–642. Springer, Heidelberg (2012) CrossRef
29.
Zurück zum Zitat Merrill, R.D.: Improving digital computer performance using residue number theory. IEEE Trans. Electron. Comput. EC–13(2), 93–101 (1964)CrossRef Merrill, R.D.: Improving digital computer performance using residue number theory. IEEE Trans. Electron. Comput. EC–13(2), 93–101 (1964)CrossRef
30.
Zurück zum Zitat Montgomery, P.L.: Modular multiplication without trial division. Math. Comput. 44(170), 519–521 (1985)CrossRefMATH Montgomery, P.L.: Modular multiplication without trial division. Math. Comput. 44(170), 519–521 (1985)CrossRefMATH
32.
Zurück zum Zitat OpenSSL. The open source toolkit for SSL/TLS (2013) OpenSSL. The open source toolkit for SSL/TLS (2013)
33.
Zurück zum Zitat Page, D., Smart, N.P.: Parallel cryptographic arithmetic using a redundant Montgomery representation. IEEE Trans. Comput. 53(11), 1474–1482 (2004)CrossRef Page, D., Smart, N.P.: Parallel cryptographic arithmetic using a redundant Montgomery representation. IEEE Trans. Comput. 53(11), 1474–1482 (2004)CrossRef
34.
Zurück zum Zitat Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21, 120–126 (1978)CrossRefMATHMathSciNet Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21, 120–126 (1978)CrossRefMATHMathSciNet
35.
Zurück zum Zitat Sánchez, A.H., Rodríguez-Henríquez, F.: NEON implementation of an attribute-based encryption scheme. In: Jacobson, M., Locasto, M., Mohassel, P., Safavi-Naini, R. (eds.) ACNS 2013. LNCS, vol. 7954, pp. 322–338. Springer, Heidelberg (2013) Sánchez, A.H., Rodríguez-Henríquez, F.: NEON implementation of an attribute-based encryption scheme. In: Jacobson, M., Locasto, M., Mohassel, P., Safavi-Naini, R. (eds.) ACNS 2013. LNCS, vol. 7954, pp. 322–338. Springer, Heidelberg (2013)
Metadaten
Titel
Montgomery Multiplication Using Vector Instructions
verfasst von
Joppe W. Bos
Peter L. Montgomery
Daniel Shumow
Gregory M. Zaverucha
Copyright-Jahr
2014
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-43414-7_24