Forum
Special Issue: Computation and Modeling
Are Next-Generation Sequencing Tools Ready for the Cloud?

https://doi.org/10.1016/j.tibtech.2017.03.005

Cloud-based next-generation sequencing (NGS) tools are still at an early stage. In this Forum article, we provide a clear picture of current cloud-based NGS solutions and highlight what is still missing, along with the future challenges in achieving an ecosystem of biotechnology clouds.


Why Is NGS Important?

DNA sequencing is the procedure of identifying the exact order of nucleotides (guanine, cytosine, adenine, and thymine) in a DNA molecule. Analyzing DNA sequences has become fundamental to biotechnology research in applied fields such as comparative genomics (e.g., metagenomics, rRNA classification, infectious disease diagnostics), genome analysis and SNP research (e.g., diagnostic approaches, disease prevention, analyzing the structure of mutant proteins), regulation of gene

How the Cloud Can Push the Evolution of NGS

Nevertheless, NGS data are complex and voluminous. Although it is possible to analyze a few nucleic acid fragments with reasonable computing resources, conducting a large number of parallel sequencing tasks means processing a huge amount of data in a short time. The enormous amount of genomics data created by NGS techniques is an example of the well-known ‘big data’ problem, which increases the demand for intensive storage and computing resources and requires significant scalable and

Current NGS Tools over the Cloud

We use the term ‘biotechnology cloud’ to indicate a specific provider supplying various cloud-based biotechnological services to researchers. Figure 1 shows the possible NGS services that a biotechnology-cloud provider can currently offer. As highlighted, comparative genomics, genome analysis, SNP research, and regulation of gene expression are currently the major areas for which researchers have adopted cloud-computing solutions to improve the big-data processing associated with NGS [1].

Toward an Ecosystem of Biotechnology Clouds for NGS

Figure 1 shows that most scientific contributions have exploited the infrastructure-as-a-service (IaaS) level, commonly installing existing bioinformatics software on virtual machines to take advantage of resource scalability. In many cases, Amazon EC2 was used to deploy existing applications, and Hadoop was adopted to parallelize the big-data processing associated with NGS. Unfortunately, the more evolved platform-as-a-service (PaaS) and software-as-a-service (SaaS) paradigms are used in only a few cases. Thus, it is evident that NGS cloud solutions are

