Published in: The Journal of Supercomputing, Issue 3/2022

24 August 2021

Implementation and optimization of ChaCha20 stream cipher on sunway taihuLight supercomputer

Authors: Weilin Cai, Heng Chen, Ziheng Wang, Xingjun Zhang

Abstract

Data have always been the most valuable asset of enterprises and research institutions, and their confidentiality should be protected as far as possible, especially for the input and output data of applications running on remote supercomputers. However, because the data are large in scale, encrypting and decrypting them takes a considerable amount of time. The ChaCha20 cipher and the Advanced Encryption Standard (AES) cipher are the only ciphers supported by TLS v1.3. ChaCha20 is a high-speed stream cipher that has emerged in recent years and has attracted growing attention for its security and efficiency. To make large-scale data encryption and decryption more efficient, we implement a parallel version of the ChaCha20 stream cipher, parallel ChaCha20, which is optimized for the SW26010 heterogeneous multi-core processor of the Sunway TaihuLight supercomputer. We use multiple optimization methods supported by the SW26010, such as Direct Memory Access (DMA) and Single Instruction Multiple Data (SIMD), and propose an optimization scheme that adapts dynamically to the size of the input data. The experimental results show that parallel ChaCha20 reaches a maximum throughput of 32.43 GB/s on a single SW26010 processor, which, as far as we know, is 2.4 times that of the best AES implementation on Sunway. Moreover, parallel ChaCha20 scales well and reaches a maximum throughput of 8296.43 GB/s when running on 1024 core groups.
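The abstract does not show why ChaCha20 lends itself to this kind of parallelization, so the following is a minimal Python sketch of the idea, not the authors' SW26010 code. It follows the ChaCha20 block function as specified in RFC 8439: each 64-byte keystream block depends only on the key, the nonce, and a block counter, so disjoint block ranges can be processed independently and concatenated. The function names (`xor_blocks`, `encrypt_in_chunks`) and the `workers` parameter are hypothetical illustrations; the sketch models neither DMA transfers nor SIMD, and the "workers" run one after another here.

```python
import struct

MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))

def quarter_round(s, a, b, c, d):
    """ChaCha quarter round on state list s (add, xor, rotate only)."""
    s[a] = (s[a] + s[b]) & MASK32; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = rotl32(s[b] ^ s[c], 7)

def chacha20_block(key, counter, nonce):
    """Return one 64-byte keystream block (RFC 8439 layout:
    4 constants, 8 key words, 1 counter word, 3 nonce words)."""
    init = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]
    init += list(struct.unpack("<8I", key))
    init += [counter & MASK32]
    init += list(struct.unpack("<3I", nonce))
    s = init[:]
    for _ in range(10):  # 10 double rounds = 20 rounds
        quarter_round(s, 0, 4, 8, 12)   # column rounds
        quarter_round(s, 1, 5, 9, 13)
        quarter_round(s, 2, 6, 10, 14)
        quarter_round(s, 3, 7, 11, 15)
        quarter_round(s, 0, 5, 10, 15)  # diagonal rounds
        quarter_round(s, 1, 6, 11, 12)
        quarter_round(s, 2, 7, 8, 13)
        quarter_round(s, 3, 4, 9, 14)
    out = [(x + y) & MASK32 for x, y in zip(s, init)]
    return struct.pack("<16I", *out)

def xor_blocks(key, nonce, data, first_block, n_blocks, initial_counter=1):
    """Encrypt or decrypt blocks [first_block, first_block + n_blocks).
    Hypothetical helper: needs no state from any other block range."""
    out = bytearray()
    for i in range(first_block, first_block + n_blocks):
        chunk = data[64 * i: 64 * (i + 1)]
        keystream = chacha20_block(key, initial_counter + i, nonce)
        out += bytes(p ^ k for p, k in zip(chunk, keystream))
    return bytes(out)

def encrypt_in_chunks(key, nonce, data, workers=4):
    """Split the block range across `workers` (a hypothetical count) and
    join the results; each range could run on a separate core."""
    total = (len(data) + 63) // 64
    per_worker = (total + workers - 1) // workers
    parts = []
    for w in range(workers):
        first = w * per_worker
        count = max(0, min(per_worker, total - first))
        parts.append(xor_blocks(key, nonce, data, first, count))
    return b"".join(parts)

if __name__ == "__main__":
    key = bytes(range(32))                                  # RFC 8439 test key
    nonce = bytes.fromhex("000000090000004a00000000")       # RFC 8439 test nonce
    message = b"input data for a remote supercomputer job " * 10
    ciphertext = encrypt_in_chunks(key, nonce, message, workers=4)
    # A stream cipher decrypts by applying the same keystream again.
    print(encrypt_in_chunks(key, nonce, ciphertext, workers=8) == message)
```

Because only additions, XORs, and rotations on 32-bit words are involved, the inner loop is also a natural fit for vector instructions, which is presumably what makes the SIMD optimization mentioned in the abstract attractive. A real deployment would pair ChaCha20 with an authenticator such as Poly1305 and must never reuse a nonce with the same key.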

Metadata
Title
Implementation and optimization of ChaCha20 stream cipher on sunway taihuLight supercomputer
Authors
Weilin Cai
Heng Chen
Ziheng Wang
Xingjun Zhang
Publication date
24 August 2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 3/2022
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-021-04023-9
