2020 | OriginalPaper | Book chapter

Extracting Insights: A Data Centre Architecture Approach in Million Genome Era

Written by: Tariq Abdullah, Ahmed Ahmet

Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVI

Publisher: Springer Berlin Heidelberg

Abstract

Advances in high-throughput sequencing technologies have drastically reduced the price of genome sequencing and led to exponential growth in the generation of genomic sequencing data. Genomics data is often stored in shared repositories and is both heterogeneous and unstructured in nature. Technically and culturally, it belongs to the big data domain because of the challenges of volume, velocity and variety.
Appropriate models for data storage and management, processing and analytics are required to meet the growing challenges of genomic and clinical data. Existing research on the storage, management and analysis of genomic and clinical data does not provide a comprehensive solution: it offers either a Hadoop-based solution that lacks a robust computing solution for data mining and knowledge discovery, or a distributed in-memory solution that is effective in reducing runtime but lacks robustness in data storage, resource management, reservation and scheduling.
In this paper, we present a scalable and elastic framework for genomic data storage, management and processing that addresses the weaknesses of existing approaches. Fundamental to our framework is a distributed resource management system with a plug-and-play NoSQL component and an in-memory, distributed computing framework with machine learning and visualisation plugin tools. We evaluate the Avro, CSV, HBase, ORC and Parquet datastores and benchmark their performance. A case study of machine-learning-based genotype clustering is presented to demonstrate and evaluate the effectiveness of the framework. The results show an overall performance improvement of 49% in the genomics data analysis pipeline over existing approaches. Finally, we make recommendations on state-of-the-art technology and tools for effective architecture approaches to the management of, and knowledge discovery from, large datasets.
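
The abstract does not say which clustering algorithm or data encoding the case study uses. As a minimal sketch of what genotype clustering can look like, assuming each individual is encoded as a vector of alternate-allele counts (0, 1 or 2) per variant and assuming plain k-means as the algorithm, the following standalone Python example groups a toy set of samples. The function names (`kmeans`, `nearest`, `squared_distance`) and the toy data are hypothetical and are not taken from the paper, which describes a distributed in-memory framework rather than single-machine code.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Sequence[float], centroids: List[List[float]]) -> int:
    """Index of the centroid closest to `point` (first one wins on ties)."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(samples: List[Sequence[int]], k: int, max_iter: int = 100) -> List[int]:
    """Cluster genotype vectors with plain k-means and return one label per sample.

    Assumption: each sample holds alternate-allele counts (0, 1 or 2) per variant.
    Initialisation is deterministic (the first k samples), which keeps the sketch
    reproducible but is not a recommended strategy for real data.
    """
    if not 0 < k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    centroids = [[float(v) for v in s] for s in samples[:k]]
    labels = [-1] * len(samples)
    for _ in range(max_iter):
        new_labels = [nearest(s, centroids) for s in samples]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: six individuals, four variants, two visibly distinct groups.
toy_samples = [
    [0, 0, 1, 0],
    [2, 2, 1, 2],
    [0, 1, 0, 0],
    [2, 1, 2, 2],
    [0, 0, 0, 1],
    [1, 2, 2, 2],
]
print(kmeans(toy_samples, k=2))  # prints [0, 1, 0, 1, 0, 1]
```

At population scale the same assign-and-update loop would normally be run by a distributed library rather than hand-written code; the sketch only illustrates the shape of the computation.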

Metadata
Title
Extracting Insights: A Data Centre Architecture Approach in Million Genome Era
Written by
Tariq Abdullah
Ahmed Ahmet
Copyright year
2020
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-62386-2_1