Skip to main content
Erschienen in: The Journal of Supercomputing 7/2018

02.05.2018

Fast induced sorting suffixes on a multicore machine

verfasst von: Bin Lao, Ge Nong, Wai Hong Chan, Yi Pan

Erschienen in: The Journal of Supercomputing | Ausgabe 7/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Sorting the suffixes of an input string is a fundamental task in many applications such as data compression, genome alignment, and full-text search. The induced sorting (IS) method has been successfully applied to design a number of state-of-the-art suffix sorting algorithms. In particular, the SAIS algorithm designed by the IS method is not only linear in time but also fast in practice. However, the parallelization of SAIS remains a challenge due to that the IS process in the algorithm is inherently sequential and the performance bottleneck of the whole algorithm. This article presents our attempt for designing a parallel variant of SAIS on a multicore machine which is considered as a shared memory parallel model, called pSAIS. By a combined use of multithreading and pipelining, the inducing process is accelerated by fully utilizing the machine’s parallel computing power. An experimental study is conducted for performance evaluation of pSAIS with the other existing parallel suffix sorting algorithms, on a set of realistic inputs with varying sizes and alphabets. The experiment results show that our program for pSAIS has a high degree of parallelism and achieves the best average time and space performance among all the parallel algorithms in comparison. While pSAIS is designed for quickly building big suffix arrays in a multicore machine, our study may give some hints for extending the induced sorting method to GPU for constructing small suffix arrays even faster.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Fußnoten
1
Both programs were downloaded from http://​code.​google.​com/​p/​ge-nong/​.
 
Literatur
1.
Zurück zum Zitat Arabnia HR, Oliver MA (1986) Fast operations on raster images with SIMD machine architectures. Comput Graph Forum 5(3):179–188CrossRef Arabnia HR, Oliver MA (1986) Fast operations on raster images with SIMD machine architectures. Comput Graph Forum 5(3):179–188CrossRef
2.
Zurück zum Zitat Burrows M, Wheeler DJ (1994) A block-sorting lossless data compression algorithm. SRC Res Rep 124:1–24 Burrows M, Wheeler DJ (1994) A block-sorting lossless data compression algorithm. SRC Res Rep 124:1–24
4.
Zurück zum Zitat Fischer J (2011) Inducing the LCP-array. In: Proceedings of the 12th International Symposium on Algorithms and Data Structures. Springer, pp 374–385 Fischer J (2011) Inducing the LCP-array. In: Proceedings of the 12th International Symposium on Algorithms and Data Structures. Springer, pp 374–385
6.
Zurück zum Zitat Kärkkäinen J, Kempa D, Puglisi SJ, Zhukova B (2017) Engineering external memory induced suffix sorting. In: Proceedings of the 19th Workshop on Algorithm Engineering and Experiments. SIAM, pp 98–108 Kärkkäinen J, Kempa D, Puglisi SJ, Zhukova B (2017) Engineering external memory induced suffix sorting. In: Proceedings of the 19th Workshop on Algorithm Engineering and Experiments. SIAM, pp 98–108
7.
Zurück zum Zitat Labeit J, Shun J, Blelloch GE (2017) Parallel lightweight wavelet tree, suffix array and FM-index construction. J Discrete Algorithms 43(1):2–17MathSciNetCrossRefMATH Labeit J, Shun J, Blelloch GE (2017) Parallel lightweight wavelet tree, suffix array and FM-index construction. J Discrete Algorithms 43(1):2–17MathSciNetCrossRefMATH
8.
Zurück zum Zitat Liu WJ, Nong G, Chan WH, Wu Y (2016) Improving a lightweight LZ77 computation algorithm for running faster. Softw Pract Exp 46(9):1201–1217CrossRef Liu WJ, Nong G, Chan WH, Wu Y (2016) Improving a lightweight LZ77 computation algorithm for running faster. Softw Pract Exp 46(9):1201–1217CrossRef
10.
Zurück zum Zitat Michailidis PD, Margaritis KG (2016) Scientific computations on multi-core systems using different programming frameworks. Appl Numer Math 104:62–80MathSciNetCrossRefMATH Michailidis PD, Margaritis KG (2016) Scientific computations on multi-core systems using different programming frameworks. Appl Numer Math 104:62–80MathSciNetCrossRefMATH
13.
Zurück zum Zitat Nawaz MS, Lali MI, Saleem S, Ali H (2014) Comparison of multi-core parallel programming models for divide and conquer algorithms. NED University Journal of Research, pp 1–8 Nawaz MS, Lali MI, Saleem S, Ali H (2014) Comparison of multi-core parallel programming models for divide and conquer algorithms. NED University Journal of Research, pp 1–8
14.
Zurück zum Zitat Nong G (2013) Practical linear-time O(1)-workspace suffix sorting for constant alphabets. ACM Trans Inf Syst 31(3):15:1–15:15MathSciNetCrossRef Nong G (2013) Practical linear-time O(1)-workspace suffix sorting for constant alphabets. ACM Trans Inf Syst 31(3):15:1–15:15MathSciNetCrossRef
15.
Zurück zum Zitat Nong G, Zhang S, Chan WH (2011) Two efficient algorithms for linear time suffix array construction. IEEE Trans Comput 60(10):1471–1484MathSciNetCrossRefMATH Nong G, Zhang S, Chan WH (2011) Two efficient algorithms for linear time suffix array construction. IEEE Trans Comput 60(10):1471–1484MathSciNetCrossRefMATH
16.
Zurück zum Zitat Puglisi SJ, Smyth WF, Turpin AH (2007) A taxonomy of suffix array construction algorithms. ACM Comput Surv 39(2):4:1–4:31CrossRef Puglisi SJ, Smyth WF, Turpin AH (2007) A taxonomy of suffix array construction algorithms. ACM Comput Surv 39(2):4:1–4:31CrossRef
17.
Zurück zum Zitat Sanchez LM, Fernandez J, Sotomayor R, Escolar S, Garcia JD (2013) A comparative study and evaluation of parallel programming models for shared-memory parallel architectures. New Gener Comput 31(3):139–161CrossRef Sanchez LM, Fernandez J, Sotomayor R, Escolar S, Garcia JD (2013) A comparative study and evaluation of parallel programming models for shared-memory parallel architectures. New Gener Comput 31(3):139–161CrossRef
18.
Zurück zum Zitat Shrestha AMS, Frith MC, Horton P (2014) A bioinformatician’s guide to the forefront of suffix array construction algorithms. Brief Bioinform 15(2):138–154CrossRef Shrestha AMS, Frith MC, Horton P (2014) A bioinformatician’s guide to the forefront of suffix array construction algorithms. Brief Bioinform 15(2):138–154CrossRef
19.
Zurück zum Zitat Shun J, Blelloch GE (2014) A simple parallel cartesian tree algorithm and its application to parallel suffix tree construction. ACM Trans Parallel Comput 1(1):8:1–8:20CrossRef Shun J, Blelloch GE (2014) A simple parallel cartesian tree algorithm and its application to parallel suffix tree construction. ACM Trans Parallel Comput 1(1):8:1–8:20CrossRef
20.
Zurück zum Zitat Shun J, Blelloch GE, Fineman JT, Gibbons PB, Kyrola A, Simhadri HV, Tangwongsan K (2012) Brief announcement: the problem based benchmark suite. In: Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp 68–70 Shun J, Blelloch GE, Fineman JT, Gibbons PB, Kyrola A, Simhadri HV, Tangwongsan K (2012) Brief announcement: the problem based benchmark suite. In: Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp 68–70
21.
Zurück zum Zitat Valafar H, Arabnia HR, Williams G (2004) Distributed global optimization and its development on the multiring network. Neural Parallel Sci Comput 12(4):465–490MathSciNetMATH Valafar H, Arabnia HR, Williams G (2004) Distributed global optimization and its development on the multiring network. Neural Parallel Sci Comput 12(4):465–490MathSciNetMATH
22.
Zurück zum Zitat Wang L, Baxter S, Owens JD (2016) Fast parallel skew and prefix-doubling suffix array construction on the GPU. Concurr Comput Pract Exp 28(12):3466–3484CrossRef Wang L, Baxter S, Owens JD (2016) Fast parallel skew and prefix-doubling suffix array construction on the GPU. Concurr Comput Pract Exp 28(12):3466–3484CrossRef
23.
Zurück zum Zitat Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomput 25(1):43–62CrossRefMATH Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomput 25(1):43–62CrossRefMATH
Metadaten
Titel
Fast induced sorting suffixes on a multicore machine
verfasst von
Bin Lao
Ge Nong
Wai Hong Chan
Yi Pan
Publikationsdatum
02.05.2018
Verlag
Springer US
Erschienen in
The Journal of Supercomputing / Ausgabe 7/2018
Print ISSN: 0920-8542
Elektronische ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2395-5

Weitere Artikel der Ausgabe 7/2018

The Journal of Supercomputing 7/2018 Zur Ausgabe