Published in: Knowledge and Information Systems 2/2014

1 November 2014 | Regular Paper

Security-aware intermediate data placement strategy in scientific cloud workflows

Authors: Wei Liu, Su Peng, Wei Du, Wei Wang, Guo Sun Zeng

Abstract

The massive computation power and storage capacity of cloud computing systems allow scientists to deploy data-intensive applications without investing in infrastructure, since large application datasets can be stored in the cloud. Building on the pay-as-you-go model, data placement strategies have been developed to store the large volumes of datasets generated by scientific cloud workflows cost-effectively. Promising as it is, this paradigm also introduces many new data security challenges when users outsource sensitive data for sharing on cloud servers that are not within the same trusted domain as the data owners. The challenge is further complicated by the security constraints on potentially sensitive data in scientific cloud workflows. To address this problem, we propose a security-aware intermediate data placement strategy. First, we build a security overhead model to measure the security overheads incurred by sensitive data. Second, we develop a data placement strategy that dynamically places the intermediate data of scientific workflows. Finally, our experimental results show that the strategy improves intermediate data security while keeping data transfer time acceptable during the execution of scientific workflows.

Metadata
Title
Security-aware intermediate data placement strategy in scientific cloud workflows
Authors
Wei Liu
Su Peng
Wei Du
Wei Wang
Guo Sun Zeng
Publication date
1 November 2014
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2014
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-014-0755-x