2018 | OriginalPaper | Chapter

Parka: A Parallel Implementation of BLAST with MapReduce

Abstract

Bioinformatics applications have become increasingly data-intensive and compute-intensive, so they need effective parallel computing methods to achieve high throughput. Although some tools already parallelize BLAST, most of them depend on complex platforms or software. This chapter presents Parka, a parallel implementation of BLAST built on Spark. The parallel execution time and speedup of Parka are evaluated in a cluster environment and compared with a Hadoop-based parallelization method. The results show that Parka is a scalable and effective parallelization approach for sequence alignment.
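The abstract reports speedup but does not define it. A minimal sketch follows, assuming the conventional definition (single-worker execution time divided by parallel execution time) and the derived parallel efficiency. The function names `speedup` and `efficiency` are hypothetical and are not taken from the chapter.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Conventional speedup: serial run time divided by parallel run time.

    Assumption: the chapter may use a different baseline (for example,
    the smallest cluster size rather than a single worker).
    """
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Parallel efficiency: speedup divided by the number of workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup(t_serial, t_parallel) / workers
```

An efficiency close to 1 indicates near-linear scaling as workers are added; lower values point to overhead such as data distribution or scheduling.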

Metadata
Title
Parka: A Parallel Implementation of BLAST with MapReduce
Authors
Li Zhang
Bing Tang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-69096-4_26