Published in: The Journal of Supercomputing 1/2014

01-10-2014

Cloud computing in e-Science: research challenges and opportunities

Authors: Xiaoyu Yang, David Wallom, Simon Waddington, Jianwu Wang, Arif Shaon, Brian Matthews, Michael Wilson, Yike Guo, Li Guo, Jon D. Blower, Athanasios V. Vasilakos, Kecheng Liu, Philip Kershaw

Abstract

Service-oriented architecture (SOA), workflow, the Semantic Web, and Grid computing are key enabling information technologies in the development of increasingly sophisticated e-Science infrastructures and application platforms. The emergence of Cloud computing as a new computing paradigm has opened new directions and opportunities for e-Science infrastructure development, but it also presents challenges. Scientific researchers increasingly find it difficult to handle “big data” using traditional data processing techniques. Such challenges demonstrate the need for a comprehensive analysis of how the above-mentioned informatics techniques can be used to develop appropriate e-Science infrastructure and platforms in the context of Cloud computing. This survey paper describes recent research advances in applying informatics techniques to facilitate scientific research, particularly from the Cloud computing perspective. Our contributions include identifying the associated research challenges and opportunities, presenting lessons learned, and describing our future vision for applying Cloud computing to e-Science. We believe our findings can help indicate the future trend of e-Science and can inform funding and research directions on how to employ computing technologies more appropriately in scientific research. We also point out open research issues in the hope of sparking new development and innovation in the e-Science field.

Literature
2.
go back to reference Yang X, Wang L et al (2011) Guide to e-Science: next generation scientific research and discovery. Springer, BerlinCrossRef Yang X, Wang L et al (2011) Guide to e-Science: next generation scientific research and discovery. Springer, BerlinCrossRef
3.
go back to reference Hey AJG, Trefethen AE (2003) In: Berman F, Fox GC, Hey AJG (eds) The data deluge: an e-Science perspective, in grid computing–making the global infrastructure a reality. Wiley, New York, pp 809–824 Hey AJG, Trefethen AE (2003) In: Berman F, Fox GC, Hey AJG (eds) The data deluge: an e-Science perspective, in grid computing–making the global infrastructure a reality. Wiley, New York, pp 809–824
4.
go back to reference Sutter JP, Alcock SG, Sawhney KJS (2011) Automated in-situ optimization of bimorph mirrors at diamond light source. In: Proc. SPIE 8139, 813906. doi:10.1117/12.892719. Sutter JP, Alcock SG, Sawhney KJS (2011) Automated in-situ optimization of bimorph mirrors at diamond light source. In: Proc. SPIE 8139, 813906. doi:10.​1117/​12.​892719.
6.
go back to reference Zhang L, Zhang J, Cai H (2007) Services computing: core enabling technology of the modern services industry. Springer, New York Zhang L, Zhang J, Cai H (2007) Services computing: core enabling technology of the modern services industry. Springer, New York
7.
go back to reference Yang X, Dove M, Bruin R et al (2010) A service-oriented framework for running quantum mechanical simulation for material properties over grids. IEEE Trans Syst Man Cybern Part C Appl Rev 40(3) Yang X, Dove M, Bruin R et al (2010) A service-oriented framework for running quantum mechanical simulation for material properties over grids. IEEE Trans Syst Man Cybern Part C Appl Rev 40(3)
8.
9.
go back to reference Hamre T, Sandven S (2011) Open service network for marine environmental data. EuroGOOS, Sopot Hamre T, Sandven S (2011) Open service network for marine environmental data. EuroGOOS, Sopot
10.
go back to reference Browdy SF (2011) GEOSS common infrastructure: internal structure and standards. GeoViQua First Workshop, Barcelona Browdy SF (2011) GEOSS common infrastructure: internal structure and standards. GeoViQua First Workshop, Barcelona
11.
go back to reference Yang X, Dove M, Bruin R, Walkingshaw A, Sinclair R, Wilson DJ, Murray-Rust P (2012) An e-Science data infrastructure for simulations within grid computing environment: methods, approaches, and practice. Concurr Comput Pract Exp. Yang X, Dove M, Bruin R, Walkingshaw A, Sinclair R, Wilson DJ, Murray-Rust P (2012) An e-Science data infrastructure for simulations within grid computing environment: methods, approaches, and practice. Concurr Comput Pract Exp.
12.
go back to reference Yang X (2011) QoS-oriented service computing: bring SOA into cloud environment. In: Liu X, Li Y (eds) Advanced design approaches to emerging software systems: principles, methodology and tools. IGI Global USA Yang X (2011) QoS-oriented service computing: bring SOA into cloud environment. In: Liu X, Li Y (eds) Advanced design approaches to emerging software systems: principles, methodology and tools. IGI Global USA
13.
go back to reference Zhang S, Wang W, Wu H, Vasilakos AV, Liu P (2013) Towards transparent and distributed workload management for large scale web servers. Future Generation Comp Syst 29(4):913–925CrossRef Zhang S, Wang W, Wu H, Vasilakos AV, Liu P (2013) Towards transparent and distributed workload management for large scale web servers. Future Generation Comp Syst 29(4):913–925CrossRef
14.
15.
go back to reference Ludäscher B, Altintas I, Berkley C, Higgins D, Jaeger E, Jones M, Lee E, Tao J, Zhao Y (2005) Scientific workflow management and the Kepler system. Concurr Comput Pract Exp 18(10):1039–1065CrossRef Ludäscher B, Altintas I, Berkley C, Higgins D, Jaeger E, Jones M, Lee E, Tao J, Zhao Y (2005) Scientific workflow management and the Kepler system. Concurr Comput Pract Exp 18(10):1039–1065CrossRef
16.
go back to reference Oinn T, Addis M, Ferris J, Marvin D, Senger M, Greenwood M, Carver T, Glover K, Pocock MR, Wipat A, Li P (2004) Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 20(17):3045–3054, Oxford University Press, London. Oinn T, Addis M, Ferris J, Marvin D, Senger M, Greenwood M, Carver T, Glover K, Pocock MR, Wipat A, Li P (2004) Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 20(17):3045–3054, Oxford University Press, London.
17.
go back to reference Taylor I, Shields M, Wang I, Harrison A (2007) The Triana workflow environment: architecture and applications. In: Taylor I, Deelman E, Gannon D, Shields M (eds) Workflows for e-Science. Springer, New York, pp 320–339CrossRef Taylor I, Shields M, Wang I, Harrison A (2007) The Triana workflow environment: architecture and applications. In: Taylor I, Deelman E, Gannon D, Shields M (eds) Workflows for e-Science. Springer, New York, pp 320–339CrossRef
18.
go back to reference Deelman E, Mehta G, Singh G, Su M, Vahi K (2007) Pegasus: mapping large-scale workflows to distributed resources. In: Taylor I, Deelman E, Gannon D, Shields M (eds) Workflows for e-Science. Springer, New York, pp 376–394CrossRef Deelman E, Mehta G, Singh G, Su M, Vahi K (2007) Pegasus: mapping large-scale workflows to distributed resources. In: Taylor I, Deelman E, Gannon D, Shields M (eds) Workflows for e-Science. Springer, New York, pp 376–394CrossRef
19.
go back to reference Fahringer T, Jugravu A, Pllana S, Prodan R, Seragiotto Jr, C, Truong H (2005) ASKALON: a tool set for cluster and Grid computing. Concurr Comput Pract Exp 17(2–4):143–169, Wiley InterScience. Fahringer T, Jugravu A, Pllana S, Prodan R, Seragiotto Jr, C, Truong H (2005) ASKALON: a tool set for cluster and Grid computing. Concurr Comput Pract Exp 17(2–4):143–169, Wiley InterScience.
20.
go back to reference Zhao Y, Hategan M, Clifford B, Foster I, von Laszewski G, Nefedova V, Raicu I, Stef-Praun T, Wilde M (2007) Swift: fast, reliable, loosely coupled parallel computation. Proceedings of 2007 IEEE congress on services (Services 2007), pp 199–206. Zhao Y, Hategan M, Clifford B, Foster I, von Laszewski G, Nefedova V, Raicu I, Stef-Praun T, Wilde M (2007) Swift: fast, reliable, loosely coupled parallel computation. Proceedings of 2007 IEEE congress on services (Services 2007), pp 199–206.
21.
go back to reference Yang X, Bruin R, Dove M (2010) Developing an end-to-end scientific workflow: a case study of using a reliable, lightweight, and comprehensive workflow platform in e-Science. doi:10.1109/MCSE.2009.211. Yang X, Bruin R, Dove M (2010) Developing an end-to-end scientific workflow: a case study of using a reliable, lightweight, and comprehensive workflow platform in e-Science. doi:10.​1109/​MCSE.​2009.​211.
22.
go back to reference Ludäscher B, Altintas I, Bowers S, Cummings J, Critchlow T, Deelman E, Roure DD, Freire J, Goble C, Jones M, Klasky S, McPhillips T, Podhorszki N, Silva C, Taylor I, Vouk M (2009) Scientific process automation and workflow management. In Shoshani A, Rotem D (eds) Scientific data management: challenges, existing technology, and deployment, computational science series. Chapman & Hall/CRC, pp 476–508. Ludäscher B, Altintas I, Bowers S, Cummings J, Critchlow T, Deelman E, Roure DD, Freire J, Goble C, Jones M, Klasky S, McPhillips T, Podhorszki N, Silva C, Taylor I, Vouk M (2009) Scientific process automation and workflow management. In Shoshani A, Rotem D (eds) Scientific data management: challenges, existing technology, and deployment, computational science series. Chapman & Hall/CRC, pp 476–508.
23.
go back to reference Deelman E, Gannon D, Shields M, Taylor I (2009) Workflows and e-Science: an overview of workflow system features and capabilities. Future Gener Comput Syst 25(5):528–540CrossRef Deelman E, Gannon D, Shields M, Taylor I (2009) Workflows and e-Science: an overview of workflow system features and capabilities. Future Gener Comput Syst 25(5):528–540CrossRef
24.
go back to reference Taylor I, Deelman E, Gannon D, Shields M (eds) (2007) Workflows for e-Science. Springer, New York, ISBN: 978-1-84628-519-6. Taylor I, Deelman E, Gannon D, Shields M (eds) (2007) Workflows for e-Science. Springer, New York, ISBN: 978-1-84628-519-6.
25.
go back to reference Yu Y (2006) Buyya R (2006) A taxonomy of workflow management systems for grid computing. J Grid Comput 3:171–200CrossRef Yu Y (2006) Buyya R (2006) A taxonomy of workflow management systems for grid computing. J Grid Comput 3:171–200CrossRef
26.
go back to reference Wang J, Korambath P, Kim S, Johnson S, Jin K, Crawl D, Altintas I, Smallen S, Labate B, Houk KN (2011) Facilitating e-science discovery using scientific workflows on the grid. In: Yang X, Wang L, Jie W (eds) Guide to e-Science: next generation scientific research and discovery. Springer, Berlin, pp 353–382. ISBN 978-0-85729-438-8CrossRef Wang J, Korambath P, Kim S, Johnson S, Jin K, Crawl D, Altintas I, Smallen S, Labate B, Houk KN (2011) Facilitating e-science discovery using scientific workflows on the grid. In: Yang X, Wang L, Jie W (eds) Guide to e-Science: next generation scientific research and discovery. Springer, Berlin, pp 353–382. ISBN 978-0-85729-438-8CrossRef
27.
go back to reference MacLennan, BJ (1992) Functional programming: practice and theory. Addison-Wesley. MacLennan, BJ (1992) Functional programming: practice and theory. Addison-Wesley.
28.
go back to reference Plale B, Gannon D, Reed DA, Graves SJ, Droegemeier K, Wilhelmson R, Ramamurthy M (2005) Towards dynamically adaptive weather analysis and forecasting in LEAD. In: International conference on computational science (2), pp 624–631. Plale B, Gannon D, Reed DA, Graves SJ, Droegemeier K, Wilhelmson R, Ramamurthy M (2005) Towards dynamically adaptive weather analysis and forecasting in LEAD. In: International conference on computational science (2), pp 624–631.
29.
go back to reference Wang J, Crawl D, Altintas I (2012) A framework for distributed data-parallel execution in the Kepler scientific workflow system. In: Proceedings of 1st international workshop on advances in the Kepler scientific workflow system and its applications at ICCS 2012 conference. Wang J, Crawl D, Altintas I (2012) A framework for distributed data-parallel execution in the Kepler scientific workflow system. In: Proceedings of 1st international workshop on advances in the Kepler scientific workflow system and its applications at ICCS 2012 conference.
30.
go back to reference Islam M, Huang A, Battisha M, Chiang M, Srinivasan S, Peters C, Neumann A, Abdelnur A (2012) Oozie: towards a scalable workflow management system for hadoop. In: Proceedings of the 1st international workshop on scalable workflow enactment engines and technologies (SWEET’12). Islam M, Huang A, Battisha M, Chiang M, Srinivasan S, Peters C, Neumann A, Abdelnur A (2012) Oozie: towards a scalable workflow management system for hadoop. In: Proceedings of the 1st international workshop on scalable workflow enactment engines and technologies (SWEET’12).
31.
go back to reference El-Rewini H, Lewis T, Ali H (1994) Task scheduling in parallel and distributed systems. PTR Prentice Hall, ISBN: 0-13-099235-6. El-Rewini H, Lewis T, Ali H (1994) Task scheduling in parallel and distributed systems. PTR Prentice Hall, ISBN: 0-13-099235-6.
32.
go back to reference Yu J, Buyya R, Ramamohanarao K (2008) Workflow scheduling algorithms for grid computing. In: Xhafa F, Abraham A (eds) Metaheuristics for scheduling in distributed computing environments. Springer, Berlin, pp 173–214. ISBN 978-3-540-69260-7CrossRef Yu J, Buyya R, Ramamohanarao K (2008) Workflow scheduling algorithms for grid computing. In: Xhafa F, Abraham A (eds) Metaheuristics for scheduling in distributed computing environments. Springer, Berlin, pp 173–214. ISBN 978-3-540-69260-7CrossRef
33.
go back to reference Dong F, Akl S (2006) Scheduling algorithms for grid computing: state of the art and open problems, Technical Report 2006–504. Queen’s University. Dong F, Akl S (2006) Scheduling algorithms for grid computing: state of the art and open problems, Technical Report 2006–504. Queen’s University.
34.
go back to reference Wieczorek M, Prodan R, Fahringer T (2005) Scheduling of scientific workflows in the ASKALON grid environment. SIGMOD Record 34(3):56–62CrossRef Wieczorek M, Prodan R, Fahringer T (2005) Scheduling of scientific workflows in the ASKALON grid environment. SIGMOD Record 34(3):56–62CrossRef
35.
go back to reference Wang J, Korambath P, Altintas I, Davis J, Crawl D (2014) Workflow as a service in the cloud: architecture and scheduling algorithms. In: Proceedings of international conference on computational science (ICCS 2014). Wang J, Korambath P, Altintas I, Davis J, Crawl D (2014) Workflow as a service in the cloud: architecture and scheduling algorithms. In: Proceedings of international conference on computational science (ICCS 2014).
36.
go back to reference Vazirani VV (2003) Approximation algorithms. Springer, Berlin. ISBN 3-540-65367-8CrossRef Vazirani VV (2003) Approximation algorithms. Springer, Berlin. ISBN 3-540-65367-8CrossRef
37.
go back to reference Morton T, Pentico DW (1993) Heuristic scheduling systems: with applications to production systems and project management. Wiley, New York. ISBN 0-471-57819-3 Morton T, Pentico DW (1993) Heuristic scheduling systems: with applications to production systems and project management. Wiley, New York. ISBN 0-471-57819-3
38.
go back to reference Kosar T, Balman M (2009) A new paradigm: data-aware scheduling in grid computing. Future Gener Comput Syst 25(4):406–413CrossRef Kosar T, Balman M (2009) A new paradigm: data-aware scheduling in grid computing. Future Gener Comput Syst 25(4):406–413CrossRef
39.
go back to reference Yuan D, Yang Y, Liu X, Zhang G, Chen J (2012) A data dependency based strategy for intermediate data storage in scientific cloud workflow systems. Concurr Comput Pract Exp 24(9):956–976CrossRef Yuan D, Yang Y, Liu X, Zhang G, Chen J (2012) A data dependency based strategy for intermediate data storage in scientific cloud workflow systems. Concurr Comput Pract Exp 24(9):956–976CrossRef
40.
go back to reference Viana V, de Oliveira D, Mattoso M (2011) Towards a cost model for scheduling scientific workflows activities in cloud environments. IEEE World Congress on Services, pp 216–219. Viana V, de Oliveira D, Mattoso M (2011) Towards a cost model for scheduling scientific workflows activities in cloud environments. IEEE World Congress on Services, pp 216–219.
41.
go back to reference Kllapi H, Sitaridi E, Tsangaris MM, Ioannidis YE (2011) Schedule optimization for data processing flows on the Cloud. In: SIGMOD conference, pp 289–300. Kllapi H, Sitaridi E, Tsangaris MM, Ioannidis YE (2011) Schedule optimization for data processing flows on the Cloud. In: SIGMOD conference, pp 289–300.
43.
go back to reference Karasavvas K, Wolstencroft K, Mina E, Cruickshank D, Williams A, De Roure D, Goble C, Roos M (2012) Opening new gateways to workflows for life scientists. In: Gesing S et al. (eds) HealthGrid applications and technologies meet science gateways for life sciences. IOS Press, pp 131–141. Karasavvas K, Wolstencroft K, Mina E, Cruickshank D, Williams A, De Roure D, Goble C, Roos M (2012) Opening new gateways to workflows for life scientists. In: Gesing S et al. (eds) HealthGrid applications and technologies meet science gateways for life sciences. IOS Press, pp 131–141.
44.
go back to reference Terstyanszky G, Kukla T, Kiss T, Kacsuk P, Balasko A, Farkas Z (2014) Enabling scientific workflow sharing through coarse-grained interoperability. Future Gener Comput Syst 37:46–59, ISSN 0167–739X. doi:10.1016/j.future.2014.02.016. Terstyanszky G, Kukla T, Kiss T, Kacsuk P, Balasko A, Farkas Z (2014) Enabling scientific workflow sharing through coarse-grained interoperability. Future Gener Comput Syst 37:46–59, ISSN 0167–739X. doi:10.​1016/​j.​future.​2014.​02.​016.
45.
go back to reference Plankensteiner K, Montagnat J, Prodan R (2011) IWIR: a language enabling portability across grid workflow systems. In: Proceedings of workshop on workflows in support of large-scale science (WORKS’11), Seattle. doi:10.1145/2110497.2110509. Plankensteiner K, Montagnat J, Prodan R (2011) IWIR: a language enabling portability across grid workflow systems. In: Proceedings of workshop on workflows in support of large-scale science (WORKS’11), Seattle. doi:10.​1145/​2110497.​2110509.
46.
go back to reference Simmhan YL, Plale B, Gannon D (2005) A survey of data provenance in e-Science. SIGMOD Record 34(3):31–36CrossRef Simmhan YL, Plale B, Gannon D (2005) A survey of data provenance in e-Science. SIGMOD Record 34(3):31–36CrossRef
47.
go back to reference Ikeda R, Park H, Widom J (2011) Provenance for generalized map and reduce workflows. In: Proceedings of CIDR’2011, pp 273–283. Ikeda R, Park H, Widom J (2011) Provenance for generalized map and reduce workflows. In: Proceedings of CIDR’2011, pp 273–283.
48.
go back to reference Crawl D, Wang J, Altintas I (2011) Provenance for mapreduce-based data-intensive workflows. In: Proceedings of the 6th workshop on workflows in support of large-scale science (WORKS11) at supercomputing 2011 (SC2011) conference, pp 21–29. Crawl D, Wang J, Altintas I (2011) Provenance for mapreduce-based data-intensive workflows. In: Proceedings of the 6th workshop on workflows in support of large-scale science (WORKS11) at supercomputing 2011 (SC2011) conference, pp 21–29.
49.
go back to reference Muniswamy-Reddy K, Macko P, Seltzer M (2010) Provenance for the cloud. In: Proceedings of the 8th conference on file and storage technologies (FAST’10), The USENIX Association. Muniswamy-Reddy K, Macko P, Seltzer M (2010) Provenance for the cloud. In: Proceedings of the 8th conference on file and storage technologies (FAST’10), The USENIX Association.
50.
go back to reference Foster I, Zhao Y, Raicu I, Lu S (2008) Cloud computing and grid computing 360-degree compared. In: Grid computing environments workshop, 2008 (GCE’08), pp 1–10. Foster I, Zhao Y, Raicu I, Lu S (2008) Cloud computing and grid computing 360-degree compared. In: Grid computing environments workshop, 2008 (GCE’08), pp 1–10.
52.
go back to reference Chang W-L, Vasilakos AV (2014) Molecular Computing: Towards A Novel Computing Architecture for Complex Problem Solving. Springer, March 2014 (Book in Big Data Series). Chang W-L, Vasilakos AV (2014) Molecular Computing: Towards A Novel Computing Architecture for Complex Problem Solving. Springer, March 2014 (Book in Big Data Series).
54.
go back to reference Wang J, Crawl D, Altintas I, Li W (2014) Big data applications using workflows for data parallel computing. IEEE Comput Sci Eng. Wang J, Crawl D, Altintas I, Li W (2014) Big data applications using workflows for data parallel computing. IEEE Comput Sci Eng.
55.
go back to reference Dean J, Ghemawat S, Mapreduce S (2008) Simplified data processing on large clusters. Commun ACM 51(1):107–113CrossRef Dean J, Ghemawat S, Mapreduce S (2008) Simplified data processing on large clusters. Commun ACM 51(1):107–113CrossRef
56.
go back to reference Moretti C, Bui H, Hollingsworth K, Rich B, Flynn P, Thain D (2010) All-pairs: an abstraction for data-intensive computing on campus Grids. IEEE Trans Parallel Distrib Syst 21:33–46CrossRef Moretti C, Bui H, Hollingsworth K, Rich B, Flynn P, Thain D (2010) All-pairs: an abstraction for data-intensive computing on campus Grids. IEEE Trans Parallel Distrib Syst 21:33–46CrossRef
57.
go back to reference Gu Y, Grossman R (2009) Sector and sphere: the design and implementation of a high performance data Cloud. Philos Trans R Soc A 367(1897):2429–2445CrossRef Gu Y, Grossman R (2009) Sector and sphere: the design and implementation of a high performance data Cloud. Philos Trans R Soc A 367(1897):2429–2445CrossRef
58.
go back to reference Gropp W, Lusk E, Skjellum A (1999) Using MPI: portable parallel programming with the message passing interface, 2nd edn. MIT Press, Cambridge, Scientific and Engineering Computation Series Gropp W, Lusk E, Skjellum A (1999) Using MPI: portable parallel programming with the message passing interface, 2nd edn. MIT Press, Cambridge, Scientific and Engineering Computation Series
59.
go back to reference Chapman B, Jost G, van der Pas R, Kuck D (2007) Using OpenMP: portable shared memory parallel programming. The MIT Press, Cambridge Chapman B, Jost G, van der Pas R, Kuck D (2007) Using OpenMP: portable shared memory parallel programming. The MIT Press, Cambridge
60.
go back to reference Schatz M (2009) Cloudburst: highly sensitive read mapping with mapreduce. Bioinformatics 25(11):1363–1369CrossRef Schatz M (2009) Cloudburst: highly sensitive read mapping with mapreduce. Bioinformatics 25(11):1363–1369CrossRef
61.
go back to reference Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL (2009) Searching for snps with Cloud computing. Genome Biol 10(134) Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL (2009) Searching for snps with Cloud computing. Genome Biol 10(134)
62.
go back to reference Kalyanaraman A, Cannon WR, Latt B, Baxter DJ (2011) MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification. Bioinformatics, Advance online access. doi:10.1093/bioinformatics/btr523 Kalyanaraman A, Cannon WR, Latt B, Baxter DJ (2011) MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification. Bioinformatics, Advance online access. doi:10.​1093/​bioinformatics/​btr523
63.
go back to reference Dahiphale D, Karve R, Vasilakos AV, Liu H, Yu Z, Chhajer A, Wang J, Wang C (2014) An advanced mapreduce:cloud mapreduce, enhancements and applications. IEEE Trans Netw Serv Manag 11(1):101–115CrossRef Dahiphale D, Karve R, Vasilakos AV, Liu H, Yu Z, Chhajer A, Wang J, Wang C (2014) An advanced mapreduce:cloud mapreduce, enhancements and applications. IEEE Trans Netw Serv Manag 11(1):101–115CrossRef
64.
go back to reference Wang J, Crawl D, Altintas I (2009) Kepler + Hadoop: a general architecture facilitating data-intensive applications in scientific workflow systems. In: Proceedings of the 4th workshop on workflows in support of large-scale science (WORKS09) at supercomputing 2009 (SC2009) conference. ACM, ISBN 978-1-60558-717-2. Wang J, Crawl D, Altintas I (2009) Kepler + Hadoop: a general architecture facilitating data-intensive applications in scientific workflow systems. In: Proceedings of the 4th workshop on workflows in support of large-scale science (WORKS09) at supercomputing 2009 (SC2009) conference. ACM, ISBN 978-1-60558-717-2.
65.
go back to reference Zhang C, Sterck HD (2009) CloudWF: a computational workflow system for clouds based on hadoop. In: Proceedings of the 1st international conference on cloud computing (CloudCom 2009). Zhang C, Sterck HD (2009) CloudWF: a computational workflow system for clouds based on hadoop. In: Proceedings of the 1st international conference on cloud computing (CloudCom 2009).
66.
go back to reference Fei X, Lu S, Lin C (2009) A mapreduce-enabled scientific workflow composition framework. In: Proceedings of 2009 IEEE international conference on web services (ICWS 2009), pp 663–670. Fei X, Lu S, Lin C (2009) A mapreduce-enabled scientific workflow composition framework. In: Proceedings of 2009 IEEE international conference on web services (ICWS 2009), pp 663–670.
67.
go back to reference Olston C, Chiou G, Chitnis L, Liu F, Han Y, Larsson M, Neumann A, Rao VBN, Sankarasubramanian V, Seth S, Tian C, ZiCornell T, Wang X (2011) Nova: continuous pig/hadoop workflows. ACM SIGMOD 2011 international conference on management of data (Industrial Track), Athens. Olston C, Chiou G, Chitnis L, Liu F, Han Y, Larsson M, Neumann A, Rao VBN, Sankarasubramanian V, Seth S, Tian C, ZiCornell T, Wang X (2011) Nova: continuous pig/hadoop workflows. ACM SIGMOD 2011 international conference on management of data (Industrial Track), Athens.
69.
70.
go back to reference Basney J, Gaynor J (2011) An oauth service for issuing certificates to science gateways for teragrid users. TeraGrid ‘11, Salt Lake City. Basney J, Gaynor J (2011) An oauth service for issuing certificates to science gateways for teragrid users. TeraGrid ‘11, Salt Lake City.
72.
go back to reference Baker CJO, Cheung K-H (eds) (2006) Semantic Web: Revolutionizing knowledge discovery in the life sciences. Baker CJO, Cheung K-H (eds) (2006) Semantic Web: Revolutionizing knowledge discovery in the life sciences.
79.
go back to reference Bechhofer S, Ainsworth J, Bhagat J, Buchan I, Couch P, Cruickshank D, Delderfield M, Dunlop I, Gamble M, Goble C, Michaelides D, Missier P, Owen S, Newman D, De Roure S, Sufi S (2010) Why linked data is not enough for scientists. In: Proceedings of the 6th IEEE e-Science conference, Brisbane. Bechhofer S, Ainsworth J, Bhagat J, Buchan I, Couch P, Cruickshank D, Delderfield M, Dunlop I, Gamble M, Goble C, Michaelides D, Missier P, Owen S, Newman D, De Roure S, Sufi S (2010) Why linked data is not enough for scientists. In: Proceedings of the 6th IEEE e-Science conference, Brisbane.
84.
go back to reference Foster I, Kesselman C (eds) The grid: blueprint for a new computing infrastructure. Morgan Kaufmann, ISBN 1-55860-475-8 Foster I, Kesselman C (eds) The grid: blueprint for a new computing infrastructure. Morgan Kaufmann, ISBN 1-55860-475-8
85.
go back to reference Fitzgerald S (2003) Grid information services for distributed resource sharing. In: Proceedings of the 10th IEEE international symposium on high performance distributed computing. Fitzgerald S (2003) Grid information services for distributed resource sharing. In: Proceedings of the 10th IEEE international symposium on high performance distributed computing.
86.
go back to reference Laure E, Fisher SM, Frohner A, Grandi C, Kunszt P (2006) Programming the grid with gLite. Comput Methods Sci Technol 12(1):33–45CrossRef Laure E, Fisher SM, Frohner A, Grandi C, Kunszt P (2006) Programming the grid with gLite. Comput Methods Sci Technol 12(1):33–45CrossRef
87.
Romberg M (2002) The UNICORE grid infrastructure. J Sci Program Arch 10(2). IOS Press Amsterdam.
88.
Risch M, Altmann J, Guo L, Fleming A, Courcoubetis C (2009) The GridEcon platform: a business scenario testbed for commercial cloud services. In: Grid economics and business models. LNCS, vol 5745/2009. Springer, Berlin.
89.
Toni F, Morge M et al. (2008) The ArguGrid platform: an overview. In: Grid economics and business models. LNCS, vol 5206/2008. Springer, Berlin.
90.
Wei G, Vasilakos AV, Zheng Y, Xiong N (2010) A game-theoretic method of fair resource allocation for cloud computing services. J Supercomput 54(2):252–269
91.
Dustdar S, Guo Y, Satzger B, Truong HL (2011) Principles of elastic processes. IEEE Internet Comput 15(5):66–71
92.
Guo L, Guo Y, Tian X (2010) IC cloud: a design space for composable cloud computing. In: Proceedings of IEEE cloud computing, Miami.
93.
Duan Q, Yan Y, Vasilakos AV (2012) A Survey on Service-Oriented Network Virtualization Toward Convergence of Networking and Cloud Computing. Network and Service Management, IEEE Transactions, 9(4):373–392, 10 Dec 2012.
94.
Xu F, Liu F, Jin H, Vasilakos AV (2014) Managing Performance Overhead of Virtual Machines in Cloud Computing: A Survey, State of the Art, and Future Directions. Proceedings of the IEEE, 102(1):11–31, 17 Dec 2013.
95.
Wang J, Korambath P, Altintas I (2011) A physical and virtual compute cluster resource load balancing approach to data-parallel scientific workflow scheduling. In: Proceedings of IEEE 2011 fifth international workshop on scientific workflows (SWF 2011), at 2011 congress on services (Services 2011), pp 212–215.
96.
Chadwick K et al. (2012) FermiGrid and FermiCloud update. International symposium on grids and clouds 2012 (ISGC 2012), Taipei.
97.
Schaffer HE, Averitt SF, Hoit MI, Peeler A, Sills ED, Vouk MA (2009) NCSU’s virtual computing lab: a Cloud computing solution. Computer 42(7):94–97
98.
Berriman GB, Deelman E, Juve G, Rynge M, Vöckler JS (2013) The application of cloud computing to scientific workflows: a study of cost and performance. Philos Trans R Soc A Math Phys Eng Sci 371(1983)
101.
Jensen J, Downing R, Waddington S, Hedges M, Zhang J, Knight G (2011) Kindura–federating data clouds for archiving. In: Proceedings of international symposium on grids and clouds.
102.
Hedges M, Hasan A, Blanke T (2007) Management and preservation of research data with iRODS. In: Proceedings of the ACM first workshop on CyberInfrastructure: information management in e-Science. doi:10.1145/1317353.1317358.
103.
Moore RW, Wan M, Rajasekar A (2005) Storage resource broker; generic software infrastructure for managing globally distributed data. In: Proceedings of local to global data interoperability–challenges and technologies, Sardinia. doi:10.1109/LGDI.2005.1612467.
104.
Chine K (2010) Open science in the cloud: towards a universal platform for scientific and statistical computing, handbook of cloud computing, part 4, pp 453–474.
106.
Schatz MC, Langmead B, Salzberg SL (2010 July) Cloud computing and the DNA data race. Nat Biotechnol 28(7):691–693
110.
Loutas N, Peristeras V, Bouras T, Kamateri E, Zeginis D, Tarabanis K (2010) Towards a reference architecture for semantically interoperable clouds. 2010 IEEE second international conference on cloud computing technology and science, pp 143–150.
111.
Andreozzi S, Burke S, Ehm F, Field L, Galang G, Konya B, Litmaath M, Millar P, Navarro JP (2009) GLUE Specification v. 2.0 (ANL).
112.
Ruiz-Alvarez A, Humphrey M (2011) A model and decision procedure for data storage in Cloud computing. ScienceCloud’11, San Jose.
117.
Yang X, Blower JD, Bastin L, Lush V, Zabala A, Maso J, Cornford D, Diaz P, Lumsden J (2012) An integrated view of data quality in earth observation. Philos Trans R Soc A. doi:10.1098/rsta.2012.0072
118.
Wei L, Zhu H, Cao Z, Jia W, Vasilakos AV (2010) SecCloud: Bridging Secure Storage and Computation in Cloud. Distributed Computing Systems Workshops (ICDCSW), 2010 IEEE 30th International Conference, IEEE, Genova, 21–25 June 2010.
119.
Wei L, Zhu H, Cao Z, Dong X, Jia W, Chen Y, Vasilakos AV (2014) Security and privacy for storage and computation in cloud computing. Inf Sci 258:371–386
120.
Bose R, Frew J (2005) Lineage retrieval for scientific data processing: a survey. ACM Comput Surv 37(1):1–28
121.
Muniswamy-Reddy K-K, Braun U, Holland DA, Macko P, Maclean D, Margo D, Seltzer M, Smogor R (2009) Layering in provenance systems. In: Proc of the USENIX Technical Conf. USENIX Association, pp 129–142.
122.
Muniswamy-Reddy K-K, Macko P, Seltzer MI (2009) Making a cloud provenance-aware. In: Cheney J (ed) First workshop on the theory and practice of provenance. USENIX, San Francisco
125.
Rellermeyer JS, Bagchi S (2012) Dependability as a cloud service–a modular approach. In: Dependable systems and networks workshops (DSN-W), 2012 IEEE/IFIP 42nd international conference. doi:10.1109/DSNW.2012.6264688.
129.
Bizer C, Heath T, Berners-Lee T (2009) Linked data–the story so far. Int J Semantic Web Inf Syst 5(3):1–22
130.
Delbru R, Campinas S, Tummarello G (2011) Searching web data: an entity retrieval and high-performance indexing model. J Web Semantics.
131.
Rochwerger B, Breitgand D, Levy E, Galis A, Nagin K, Llorente IM, Montero R, Wolfsthal Y, Elmroth E, Caceres J, Ben-Yehuda M, Emmerich W, Gala F (2009) The reservoir model and architecture for open federated Cloud computing. IBM J Res Dev 53(4):1–11
133.
He Q, Zhou S, Kobler B, Duffy D, McGlynn T (2010) Case study for running HPC applications in public clouds. In: Proceedings of the 19th ACM international symposium on high performance distributed computing. ACM, pp 395–401.
134.
Bientinesi P, Iakymchuk R, Napper J (2010) HPC on competitive cloud resources. In: Handbook of cloud computing. Springer, pp 493–516.
135.
Vouk MA, Sills E, Dreher P (2010) Integration of high-performance computing into cloud computing services. Handbook of cloud computing. Springer, US, pp 255–276
Metadata
Title
Cloud computing in e-Science: research challenges and opportunities
Authors
Xiaoyu Yang
David Wallom
Simon Waddington
Jianwu Wang
Arif Shaon
Brian Matthews
Michael Wilson
Yike Guo
Li Guo
Jon D. Blower
Athanasios V. Vasilakos
Kecheng Liu
Philip Kershaw
Publication date
01-10-2014
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 1/2014
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-014-1251-5
