Skip to main content
Top
Published in: The Journal of Supercomputing 7/2018

02-05-2018

Fast induced sorting suffixes on a multicore machine

Authors: Bin Lao, Ge Nong, Wai Hong Chan, Yi Pan

Published in: The Journal of Supercomputing | Issue 7/2018

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Sorting the suffixes of an input string is a fundamental task in many applications such as data compression, genome alignment, and full-text search. The induced sorting (IS) method has been successfully applied to design a number of state-of-the-art suffix sorting algorithms. In particular, the SAIS algorithm designed by the IS method is not only linear in time but also fast in practice. However, the parallelization of SAIS remains a challenge due to that the IS process in the algorithm is inherently sequential and the performance bottleneck of the whole algorithm. This article presents our attempt for designing a parallel variant of SAIS on a multicore machine which is considered as a shared memory parallel model, called pSAIS. By a combined use of multithreading and pipelining, the inducing process is accelerated by fully utilizing the machine’s parallel computing power. An experimental study is conducted for performance evaluation of pSAIS with the other existing parallel suffix sorting algorithms, on a set of realistic inputs with varying sizes and alphabets. The experiment results show that our program for pSAIS has a high degree of parallelism and achieves the best average time and space performance among all the parallel algorithms in comparison. While pSAIS is designed for quickly building big suffix arrays in a multicore machine, our study may give some hints for extending the induced sorting method to GPU for constructing small suffix arrays even faster.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Footnotes
1
Both programs were downloaded from http://​code.​google.​com/​p/​ge-nong/​.
 
Literature
1.
go back to reference Arabnia HR, Oliver MA (1986) Fast operations on raster images with SIMD machine architectures. Comput Graph Forum 5(3):179–188CrossRef Arabnia HR, Oliver MA (1986) Fast operations on raster images with SIMD machine architectures. Comput Graph Forum 5(3):179–188CrossRef
2.
go back to reference Burrows M, Wheeler DJ (1994) A block-sorting lossless data compression algorithm. SRC Res Rep 124:1–24 Burrows M, Wheeler DJ (1994) A block-sorting lossless data compression algorithm. SRC Res Rep 124:1–24
4.
go back to reference Fischer J (2011) Inducing the LCP-array. In: Proceedings of the 12th International Symposium on Algorithms and Data Structures. Springer, pp 374–385 Fischer J (2011) Inducing the LCP-array. In: Proceedings of the 12th International Symposium on Algorithms and Data Structures. Springer, pp 374–385
6.
go back to reference Kärkkäinen J, Kempa D, Puglisi SJ, Zhukova B (2017) Engineering external memory induced suffix sorting. In: Proceedings of the 19th Workshop on Algorithm Engineering and Experiments. SIAM, pp 98–108 Kärkkäinen J, Kempa D, Puglisi SJ, Zhukova B (2017) Engineering external memory induced suffix sorting. In: Proceedings of the 19th Workshop on Algorithm Engineering and Experiments. SIAM, pp 98–108
7.
go back to reference Labeit J, Shun J, Blelloch GE (2017) Parallel lightweight wavelet tree, suffix array and FM-index construction. J Discrete Algorithms 43(1):2–17MathSciNetCrossRefMATH Labeit J, Shun J, Blelloch GE (2017) Parallel lightweight wavelet tree, suffix array and FM-index construction. J Discrete Algorithms 43(1):2–17MathSciNetCrossRefMATH
8.
go back to reference Liu WJ, Nong G, Chan WH, Wu Y (2016) Improving a lightweight LZ77 computation algorithm for running faster. Softw Pract Exp 46(9):1201–1217CrossRef Liu WJ, Nong G, Chan WH, Wu Y (2016) Improving a lightweight LZ77 computation algorithm for running faster. Softw Pract Exp 46(9):1201–1217CrossRef
10.
go back to reference Michailidis PD, Margaritis KG (2016) Scientific computations on multi-core systems using different programming frameworks. Appl Numer Math 104:62–80MathSciNetCrossRefMATH Michailidis PD, Margaritis KG (2016) Scientific computations on multi-core systems using different programming frameworks. Appl Numer Math 104:62–80MathSciNetCrossRefMATH
13.
go back to reference Nawaz MS, Lali MI, Saleem S, Ali H (2014) Comparison of multi-core parallel programming models for divide and conquer algorithms. NED University Journal of Research, pp 1–8 Nawaz MS, Lali MI, Saleem S, Ali H (2014) Comparison of multi-core parallel programming models for divide and conquer algorithms. NED University Journal of Research, pp 1–8
14.
go back to reference Nong G (2013) Practical linear-time O(1)-workspace suffix sorting for constant alphabets. ACM Trans Inf Syst 31(3):15:1–15:15MathSciNetCrossRef Nong G (2013) Practical linear-time O(1)-workspace suffix sorting for constant alphabets. ACM Trans Inf Syst 31(3):15:1–15:15MathSciNetCrossRef
15.
go back to reference Nong G, Zhang S, Chan WH (2011) Two efficient algorithms for linear time suffix array construction. IEEE Trans Comput 60(10):1471–1484MathSciNetCrossRefMATH Nong G, Zhang S, Chan WH (2011) Two efficient algorithms for linear time suffix array construction. IEEE Trans Comput 60(10):1471–1484MathSciNetCrossRefMATH
16.
go back to reference Puglisi SJ, Smyth WF, Turpin AH (2007) A taxonomy of suffix array construction algorithms. ACM Comput Surv 39(2):4:1–4:31CrossRef Puglisi SJ, Smyth WF, Turpin AH (2007) A taxonomy of suffix array construction algorithms. ACM Comput Surv 39(2):4:1–4:31CrossRef
17.
go back to reference Sanchez LM, Fernandez J, Sotomayor R, Escolar S, Garcia JD (2013) A comparative study and evaluation of parallel programming models for shared-memory parallel architectures. New Gener Comput 31(3):139–161CrossRef Sanchez LM, Fernandez J, Sotomayor R, Escolar S, Garcia JD (2013) A comparative study and evaluation of parallel programming models for shared-memory parallel architectures. New Gener Comput 31(3):139–161CrossRef
18.
go back to reference Shrestha AMS, Frith MC, Horton P (2014) A bioinformatician’s guide to the forefront of suffix array construction algorithms. Brief Bioinform 15(2):138–154CrossRef Shrestha AMS, Frith MC, Horton P (2014) A bioinformatician’s guide to the forefront of suffix array construction algorithms. Brief Bioinform 15(2):138–154CrossRef
19.
go back to reference Shun J, Blelloch GE (2014) A simple parallel cartesian tree algorithm and its application to parallel suffix tree construction. ACM Trans Parallel Comput 1(1):8:1–8:20CrossRef Shun J, Blelloch GE (2014) A simple parallel cartesian tree algorithm and its application to parallel suffix tree construction. ACM Trans Parallel Comput 1(1):8:1–8:20CrossRef
20.
go back to reference Shun J, Blelloch GE, Fineman JT, Gibbons PB, Kyrola A, Simhadri HV, Tangwongsan K (2012) Brief announcement: the problem based benchmark suite. In: Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp 68–70 Shun J, Blelloch GE, Fineman JT, Gibbons PB, Kyrola A, Simhadri HV, Tangwongsan K (2012) Brief announcement: the problem based benchmark suite. In: Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp 68–70
21.
go back to reference Valafar H, Arabnia HR, Williams G (2004) Distributed global optimization and its development on the multiring network. Neural Parallel Sci Comput 12(4):465–490MathSciNetMATH Valafar H, Arabnia HR, Williams G (2004) Distributed global optimization and its development on the multiring network. Neural Parallel Sci Comput 12(4):465–490MathSciNetMATH
22.
go back to reference Wang L, Baxter S, Owens JD (2016) Fast parallel skew and prefix-doubling suffix array construction on the GPU. Concurr Comput Pract Exp 28(12):3466–3484CrossRef Wang L, Baxter S, Owens JD (2016) Fast parallel skew and prefix-doubling suffix array construction on the GPU. Concurr Comput Pract Exp 28(12):3466–3484CrossRef
23.
go back to reference Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomput 25(1):43–62CrossRefMATH Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomput 25(1):43–62CrossRefMATH
Metadata
Title
Fast induced sorting suffixes on a multicore machine
Authors
Bin Lao
Ge Nong
Wai Hong Chan
Yi Pan
Publication date
02-05-2018
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 7/2018
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2395-5

Other articles of this Issue 7/2018

The Journal of Supercomputing 7/2018 Go to the issue

Premium Partner