Published in: The Journal of Supercomputing 7/2018

31 March 2018

Improving the energy efficiency and performance of data-intensive workflows in virtualized clouds

Authors: Xilong Qu, Peng Xiao, Lirong Huang

Abstract

In recent years, deploying and running data-intensive workflows on cloud platforms has become increasingly popular in many areas. Unlike computation-intensive applications, a data-intensive workflow typically has to transfer bulk data between different resource sites, which makes some traditional energy-efficiency optimization techniques difficult to apply. In this paper, we first formulate a power model for data-intensive workflows that takes into account the power consumed by data transfers. Based on this power model, we introduce a novel metric called Shortest Path in terms of Energy Consumption and design an energy-efficient heuristic scheduling algorithm that aims to reduce the extra energy consumption caused by delays in bulk data transfers. Extensive experiments and performance evaluations show that the proposed scheduling algorithm significantly reduces the overall energy consumption of running data-intensive workflows compared with several existing algorithms. In addition, the proposed algorithm exhibits better adaptiveness and robustness when a cloud system faces intensive and unpredictable workloads.
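
To make the idea of an energy-weighted path concrete, the following is a minimal sketch of one plausible reading of a "shortest path in terms of energy consumption". It is not the authors' definition or algorithm, which this page does not reproduce. The sketch assumes a workflow modeled as a directed acyclic graph in which each task has a compute energy, each edge has a transfer energy equal to link power multiplied by transfer time, and the path of interest is the entry-to-exit path with the lowest total energy. All function names, parameters, and numbers are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def transfer_energy(data_mb, bandwidth_mb_per_s, link_power_w):
    """Assumed model: energy (J) = link power (W) x transfer time (s)."""
    return link_power_w * (data_mb / bandwidth_mb_per_s)


def min_energy_path(compute_j, edges_j, entry, exit_task):
    """Return (energy, path) for the lowest-energy entry-to-exit path.

    compute_j: {task: compute energy in joules}
    edges_j:   {(src, dst): transfer energy in joules}
    Raises KeyError if exit_task is not reachable from entry.
    """
    # Map each task to the set of its direct predecessors.
    preds = {task: set() for task in compute_j}
    for (src, dst) in edges_j:
        preds[dst].add(src)

    # best[t] = (lowest energy found so far to finish t, path taken)
    best = {entry: (compute_j[entry], [entry])}

    # Visit tasks so that every predecessor is handled before its successors.
    for task in TopologicalSorter(preds).static_order():
        for pred in preds[task]:
            if pred not in best:
                continue  # predecessor is not reachable from the entry task
            energy = best[pred][0] + edges_j[(pred, task)] + compute_j[task]
            if task not in best or energy < best[task][0]:
                best[task] = (energy, best[pred][1] + [task])
    return best[exit_task]


# Hypothetical four-task workflow: A feeds B and C, which both feed D.
compute = {"A": 10.0, "B": 5.0, "C": 8.0, "D": 4.0}
edges = {
    ("A", "B"): transfer_energy(100, 50, 2.0),  # 4.0 J
    ("A", "C"): transfer_energy(20, 40, 2.0),   # 1.0 J
    ("B", "D"): 1.0,
    ("C", "D"): 3.0,
}
energy, path = min_energy_path(compute, edges, "A", "D")
```

In this toy workflow the path through B costs 24 J and the path through C costs 26 J, even though the A-to-B transfer is the most expensive single transfer. A scheduler that looked only at transfer sizes would therefore rank the two paths differently from one that sums compute and transfer energy.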

Metadata
Title
Improving the energy efficiency and performance of data-intensive workflows in virtualized clouds
Authors
Xilong Qu
Peng Xiao
Lirong Huang
Publication date
31 March 2018
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 7/2018
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2344-3
