DOI: 10.1145/2484762.2484827
research-article

Experiences in building a next-generation sequencing analysis service using Galaxy, Globus Online and Amazon Web Service

Published: 22 July 2013

ABSTRACT

We describe Globus Genomics, a system that we have developed for rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. The system is notable for its high degree of end-to-end automation, which covers every stage of the data analysis pipeline: initial data access from a remote sequencing center or database (via the Globus Online file transfer system); on-demand resource acquisition on Amazon EC2 (via the Globus Provision cloud manager); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); and efficient scheduling of these pipelines over many processors (via the Condor scheduler). Biomedical researchers can thus analyze large NGS datasets rapidly and in a fully automated manner, using only a web browser and without installing any software.
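
The four stages named in the abstract can be pictured as an ordered chain in which each stage hands its result to the next. The sketch below is only an illustration of that ordering, not the authors' implementation. It does not call Globus Online, Globus Provision, Galaxy, or Condor. Every function name, dictionary key, file name, and workflow step in it is a hypothetical placeholder, and the round-robin job assignment is a simple stand-in for real scheduling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, object]


@dataclass(frozen=True)
class Stage:
    name: str                          # stage as described in the abstract
    tool: str                          # component the abstract pairs with it
    run: Callable[[Context], Context]  # hypothetical placeholder for the work


def access_data(ctx: Context) -> Context:
    # Placeholder: pretend each remote input has been staged locally.
    return {**ctx, "staged": [f"staged/{name}" for name in ctx["inputs"]]}


def acquire_resources(ctx: Context) -> Context:
    # Placeholder: record how many workers were requested on demand.
    return {**ctx, "workers": ctx.get("requested_workers", 1)}


def build_workflow(ctx: Context) -> Context:
    # Placeholder: one job per (workflow step, staged file) pair.
    jobs = [(step, path) for path in ctx["staged"] for step in ctx["steps"]]
    return {**ctx, "jobs": jobs}


def schedule_jobs(ctx: Context) -> Context:
    # Stand-in for a scheduler: round-robin jobs across the workers.
    assignment = {worker: [] for worker in range(ctx["workers"])}
    for index, job in enumerate(ctx["jobs"]):
        assignment[index % ctx["workers"]].append(job)
    return {**ctx, "assignment": assignment}


PIPELINE: List[Stage] = [
    Stage("data access", "Globus Online", access_data),
    Stage("resource acquisition", "Globus Provision", acquire_resources),
    Stage("pipeline specification", "Galaxy", build_workflow),
    Stage("scheduling", "Condor", schedule_jobs),
]


def run_pipeline(ctx: Context, stages: List[Stage] = PIPELINE) -> Context:
    """Run the stages in order, logging which tool handled each one."""
    for stage in stages:
        ctx = stage.run(ctx)
        ctx = {**ctx, "log": list(ctx.get("log", [])) + [stage.tool]}
    return ctx


if __name__ == "__main__":
    result = run_pipeline({
        "inputs": ["sample_a.fastq", "sample_b.fastq"],  # hypothetical files
        "steps": ["align", "call_variants"],             # hypothetical steps
        "requested_workers": 2,
    })
    for worker, jobs in result["assignment"].items():
        print(worker, jobs)
```

Modeling each stage as a function from one context to the next keeps the ordering explicit, which mirrors the end-to-end automation the abstract emphasizes: no stage needs manual intervention to pick up where the previous one finished.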

Published in

XSEDE '13: Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery
July 2013
433 pages
ISBN: 9781450321709
DOI: 10.1145/2484762

Copyright © 2013. Copyright is held by the owner/author(s).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 129 of 190 submissions, 68%
