
2018 | Original Paper | Book Chapter

Parka: A Parallel Implementation of BLAST with MapReduce


Abstract

Bioinformatics applications have become increasingly data-intensive and compute-intensive, which calls for effective parallel computing methods that deliver high throughput. Although some tools exist to parallelize BLAST, most of them depend on complex platforms or software. A parallel BLAST called Parka is implemented using Spark. The parallel execution time and speedup of Parka are evaluated in a cluster environment, and the approach is then compared with a Hadoop-based parallelization method. The results show that Parka is a scalable and effective parallelization approach for sequence alignment.
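The abstract evaluates parallel execution time and speedup. As a reminder of how these two quantities relate, the following minimal Python sketch computes speedup (serial time divided by parallel time) and parallel efficiency (speedup divided by worker count). The function names and the timing values are hypothetical and chosen only for illustration; they are not measurements from the chapter.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Return the speedup S = T_serial / T_parallel."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Return the parallel efficiency E = S / workers."""
    if workers <= 0:
        raise ValueError("worker count must be positive")
    return speedup(t_serial, t_parallel) / workers


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by worker count.
    # These are illustrative values, not results reported for Parka.
    timings = {1: 1200.0, 2: 640.0, 4: 350.0, 8: 200.0}
    t_serial = timings[1]
    for workers, seconds in sorted(timings.items()):
        s = speedup(t_serial, seconds)
        e = efficiency(t_serial, seconds, workers)
        print(f"{workers} workers: speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency close to 1 means execution time shrinks almost in proportion to the number of workers, which is the usual sense in which a parallel method is called scalable.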

Metadata
Title
Parka: A Parallel Implementation of BLAST with MapReduce
Authors
Li Zhang
Bing Tang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-69096-4_26