Skip to main content
Top
Published in: International Journal of Parallel Programming 1/2019

03-05-2018

Parallel SIMD CPU and GPU Implementations of Berlekamp–Massey Algorithm and Its Error Correction Application

Author: Hamidreza Mohebbi

Published in: International Journal of Parallel Programming | Issue 1/2019

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The Berlekamp–Massey algorithm finds the shortest linear feedback shift register for a binary input sequence. A wide range of applications like cryptography and digital signal processing use this algorithm. This research proposes novel parallel mechanisms offered by heterogeneous CPU and GPU hardwares in order to achieve the best possible performance for BMA. The proposed bitwise implementation of the BMA algorithm is almost 35 times faster than state of the art implementations. This further improvement is achieved by using SIMD instructions which provides data level parallelism. This new approach can be 4.6 and 35 times faster than a bitwise CPU and state of the art implementations, respectively. In order to achieve the highest possible speedup over a multi-core structure, a multi-threading implementation is introduced in this research. By leveraging on OpenMP we were able to obtain a speedup of 10 times over 12 cores server. The GPU device with thousands of processing cores can bring great speedup over the best CPU implementation. Two other parallel mechanisms offered by GPU are concurrent kernel execution and streaming. They achieve 14.5 and 2.2 times of speedup compared to CPU serial and typical CUDA implementations, respectively. Also, the performance of the openMP code with SIMD instructions is compared with GPU stream implementation. The effectiveness of the proposed method is evaluated in a real world error correction application and it achieves 6.8 times of speedup.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Ali, H., Ouyang, M., Soliman, A., Sheta, W.: Parallelizing the Berlekamp–Massey algorithm. Int. J. Comput. Sci. Inf. Secur. 13(11), 42 (2015) Ali, H., Ouyang, M., Soliman, A., Sheta, W.: Parallelizing the Berlekamp–Massey algorithm. Int. J. Comput. Sci. Inf. Secur. 13(11), 42 (2015)
3.
go back to reference Berlekamp, E.R.: Algebraic coding theory. McGraw-Hill, New York (1968) Berlekamp, E.R.: Algebraic coding theory. McGraw-Hill, New York (1968)
4.
go back to reference Chien, R.T.: Cyclic decoding procedures for Bose–Chaudhuri–Hocquenghem codes. IEEE Trans. Inf. Theory 10(4), 357–363 (1964)CrossRefMATH Chien, R.T.: Cyclic decoding procedures for Bose–Chaudhuri–Hocquenghem codes. IEEE Trans. Inf. Theory 10(4), 357–363 (1964)CrossRefMATH
5.
go back to reference Cowan, B., Cary, J., Meiser, D.: GPU acceleration of particle-in-cell methods. In: APS Meeting Abstracts (2017) Cowan, B., Cary, J., Meiser, D.: GPU acceleration of particle-in-cell methods. In: APS Meeting Abstracts (2017)
6.
go back to reference Ding, C., Xiao, G., Shan, W.: The Stability Theory of Stream Ciphers, vol. 561. Springer Science & Business Media, Berlin (1991)MATH Ding, C., Xiao, G., Shan, W.: The Stability Theory of Stream Ciphers, vol. 561. Springer Science & Business Media, Berlin (1991)MATH
9.
go back to reference Gille, P., Szamuely, T.: Central Simple Algebras and Galois Cohomology, vol. 165. Cambridge University Press, Cambridge (2017)CrossRefMATH Gille, P., Szamuely, T.: Central Simple Algebras and Galois Cohomology, vol. 165. Cambridge University Press, Cambridge (2017)CrossRefMATH
10.
go back to reference Guide, P.: Intel\(\textregistered \) 64 and IA-32 Architectures Software Developers Manual. Volume 3B: System Programming Guide, Part 2 (2011) Guide, P.: Intel\(\textregistered \) 64 and IA-32 Architectures Software Developers Manual. Volume 3B: System Programming Guide, Part 2 (2011)
11.
go back to reference Henkel, W.: Another description of the Berlekamp–Massey algorithm. IEE Proc. I (Commun. Speech Vis.) 136(3), 197–200 (1989)MathSciNetCrossRef Henkel, W.: Another description of the Berlekamp–Massey algorithm. IEE Proc. I (Commun. Speech Vis.) 136(3), 197–200 (1989)MathSciNetCrossRef
12.
go back to reference Ji, W., Zhang, W., Peng, X., Liu, Y.: High-efficient Reed–Solomon decoder design using recursive Berlekamp–Massey architecture. IET Commun. 10(4), 381–386 (2016)CrossRef Ji, W., Zhang, W., Peng, X., Liu, Y.: High-efficient Reed–Solomon decoder design using recursive Berlekamp–Massey architecture. IET Commun. 10(4), 381–386 (2016)CrossRef
13.
go back to reference Kirk, D.B., Wen-Mei, W.H.: Programming Massively Parallel Processors: A Hands-On Approach. Morgan Kaufmann, Los Altos (2016) Kirk, D.B., Wen-Mei, W.H.: Programming Massively Parallel Processors: A Hands-On Approach. Morgan Kaufmann, Los Altos (2016)
14.
go back to reference Kötter, R.: A fast parallel implementation of a Berlekamp–Massey algorithm for algebraic-geometric codes. IEEE Trans. Inf. Theory 44(4), 1353–1368 (1998)MathSciNetCrossRefMATH Kötter, R.: A fast parallel implementation of a Berlekamp–Massey algorithm for algebraic-geometric codes. IEEE Trans. Inf. Theory 44(4), 1353–1368 (1998)MathSciNetCrossRefMATH
16.
go back to reference Mérai, L., Niederreiter, H., Winterhof, A.: Expansion complexity and linear complexity of sequences over finite fields. Cryptogr. Commun. 9(4), 501–509 (2017)MathSciNetCrossRefMATH Mérai, L., Niederreiter, H., Winterhof, A.: Expansion complexity and linear complexity of sequences over finite fields. Cryptogr. Commun. 9(4), 501–509 (2017)MathSciNetCrossRefMATH
17.
go back to reference Mittal, S., Vetter, J.S.: A survey of CPU-GPU heterogeneous computing techniques. ACM Comput. Surv. (CSUR) 47(4), 69 (2015)CrossRef Mittal, S., Vetter, J.S.: A survey of CPU-GPU heterogeneous computing techniques. ACM Comput. Surv. (CSUR) 47(4), 69 (2015)CrossRef
18.
go back to reference Mohebbi, H., Mu, Y., Ding, W.: Learning weighted distance metric from group level information and its parallel implementation. Appl. Intell. 46(1), 180–196 (2017)CrossRef Mohebbi, H., Mu, Y., Ding, W.: Learning weighted distance metric from group level information and its parallel implementation. Appl. Intell. 46(1), 180–196 (2017)CrossRef
19.
go back to reference Mohebbi, H., Vajdi, A., Haspel, N., Simovici, D.: Detecting chromosomal structural variation using jaccard distance and parallel architecture. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 1959–1964. IEEE (2017) Mohebbi, H., Vajdi, A., Haspel, N., Simovici, D.: Detecting chromosomal structural variation using jaccard distance and parallel architecture. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 1959–1964. IEEE (2017)
20.
go back to reference Mohebbi, H.R., Kashefi, O., Sharifi, M.: Zivm: a zero-copy inter-vm communication mechanism for cloud computing. Comput. Inf. Sci. 4(6), 18 (2011) Mohebbi, H.R., Kashefi, O., Sharifi, M.: Zivm: a zero-copy inter-vm communication mechanism for cloud computing. Comput. Inf. Sci. 4(6), 18 (2011)
21.
go back to reference Moon Todd, K.: Error Correction Coding: Mathematical Methods and Algorithms. Technical Report, Wiley, ISBN:0-471-64800-0 (2005) Moon Todd, K.: Error Correction Coding: Mathematical Methods and Algorithms. Technical Report, Wiley, ISBN:0-471-64800-0 (2005)
22.
go back to reference Murase, M.: Linear feedback shift register. US Patent 5,090,035 (1992) Murase, M.: Linear feedback shift register. US Patent 5,090,035 (1992)
27.
go back to reference Pennycook, S.J., Hughes, C.J., Smelyanskiy, M., Jarvis, S.A.: Exploring SIMD for molecular dynamics, using intel\(\textregistered \) xeon\(\textregistered \) processors and intel\(\textregistered \) xeon phi coprocessors. In: 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (IPDPS), pp. 1085–1097. IEEE (2013) Pennycook, S.J., Hughes, C.J., Smelyanskiy, M., Jarvis, S.A.: Exploring SIMD for molecular dynamics, using intel\(\textregistered \) xeon\(\textregistered \) processors and intel\(\textregistered \) xeon phi coprocessors. In: 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (IPDPS), pp. 1085–1097. IEEE (2013)
28.
go back to reference Rajski, J., Tyszer, J.: Primitive polynomials over GF (2) of degree up to 660 with uniformly distributed coefficients. J. Electron. Test. 19(6), 645–657 (2003)CrossRef Rajski, J., Tyszer, J.: Primitive polynomials over GF (2) of degree up to 660 with uniformly distributed coefficients. J. Electron. Test. 19(6), 645–657 (2003)CrossRef
30.
go back to reference Rueppel, R.A.: Linear complexity and random sequences. In: Advances in Cryptology EUROCRYPT85, pp. 167–188. Springer (1986) Rueppel, R.A.: Linear complexity and random sequences. In: Advances in Cryptology EUROCRYPT85, pp. 167–188. Springer (1986)
31.
go back to reference Stamp, M., Martin, C.F.: An algorithm for the k-error linear complexity of binary sequences with period 2 n. IEEE Trans. Inf. Theory 39(4), 1398–1401 (1993)MathSciNetCrossRefMATH Stamp, M., Martin, C.F.: An algorithm for the k-error linear complexity of binary sequences with period 2 n. IEEE Trans. Inf. Theory 39(4), 1398–1401 (1993)MathSciNetCrossRefMATH
Metadata
Title
Parallel SIMD CPU and GPU Implementations of Berlekamp–Massey Algorithm and Its Error Correction Application
Author
Hamidreza Mohebbi
Publication date
03-05-2018
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 1/2019
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-018-0574-x

Other articles of this Issue 1/2019

International Journal of Parallel Programming 1/2019 Go to the issue

Premium Partner