Skip to main content
Top
Published in: Computing 2/2023

10-10-2022 | Regular Paper

Meddal: meeting deadlines and data locality via bin packing in cloud environment

Author: Marzieh Malekimajd

Published in: Computing | Issue 2/2023

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Cloud computing has become one of the most popular frameworks for big data processing and analysis. To have better performance, the data chunks and their corresponding process nodes must be near each other, which is known as data locality. Moreover, to meet the deadline constraints, the available processing power next to the data locations is crucial. Considering the time and resource constraints, designing an appropriate job scheduler is essential for the effective management of memory and processing capacity. These concerns coalesce into the problem of joint data placement and job scheduling with data locality and deadline constraints. In this paper, two kinds of job sets are determined, one with equal deadlines and the other one with various deadlines. The problem of joint data placement and job scheduling with a single deadline is formulated as the problem of bin packing with splittable items and cardinality constraints, which is strongly NP-hard. Moreover, this paper proposes the polynomial-time Meddal algorithm to find a feasible solution with a few processing nodes. Experiment results have shown that the proposed method is effective, with the number of required nodes reduced by \(33\%\) when compared to the algorithms Cred and First Fit.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Puthal D, Sahoo B, Mishra S, Swain S (2015) Cloud computing features, issues, and challenges: a big picture. IEEE, pp 116–123 Puthal D, Sahoo B, Mishra S, Swain S (2015) Cloud computing features, issues, and challenges: a big picture. IEEE, pp 116–123
2.
4.
go back to reference Xu M, Alamro S, Lan T, Subramaniam S (2017) CRED: cloud right-sizing with execution deadlines and data locality. IEEE Trans Parallel Distrib Syst 28(12):3389–3400CrossRef Xu M, Alamro S, Lan T, Subramaniam S (2017) CRED: cloud right-sizing with execution deadlines and data locality. IEEE Trans Parallel Distrib Syst 28(12):3389–3400CrossRef
5.
go back to reference Shi S, Wu C, Li Z (2015) Cost-minimizing online VM purchasing for application service providers with arbitrary demands. pp 146–154 Shi S, Wu C, Li Z (2015) Cost-minimizing online VM purchasing for application service providers with arbitrary demands. pp 146–154
6.
go back to reference Tarafdar A, Debnath M, Khatua S, Das RK (2021) Energy and makespan aware scheduling of deadline sensitive tasks in the cloud environment. J Grid Comput 19(2):1–25CrossRef Tarafdar A, Debnath M, Khatua S, Das RK (2021) Energy and makespan aware scheduling of deadline sensitive tasks in the cloud environment. J Grid Comput 19(2):1–25CrossRef
7.
go back to reference Singh S, Chana I (2016) A survey on resource scheduling in cloud computing: issues and challenges. J Grid Comput 14(2):217–264CrossRef Singh S, Chana I (2016) A survey on resource scheduling in cloud computing: issues and challenges. J Grid Comput 14(2):217–264CrossRef
8.
go back to reference Zhan ZH et al (2015) Cloud computing resource scheduling and a survey of its evolutionary approaches. ACM Comput Surv 47(4):63:1-63:33CrossRef Zhan ZH et al (2015) Cloud computing resource scheduling and a survey of its evolutionary approaches. ACM Comput Surv 47(4):63:1-63:33CrossRef
9.
go back to reference Magoulès F, Pan J, Teng F (2016) Cloud computing: lata-intensive computing and scheduling. Chapman and Hall/CRCCrossRef Magoulès F, Pan J, Teng F (2016) Cloud computing: lata-intensive computing and scheduling. Chapman and Hall/CRCCrossRef
10.
go back to reference Ma X, Fan X, Liu J, Jiang H, Peng K (2017) vLocality: revisiting data locality for MapReduce in virtualized clouds. IEEE Netw 31(1):28–35CrossRef Ma X, Fan X, Liu J, Jiang H, Peng K (2017) vLocality: revisiting data locality for MapReduce in virtualized clouds. IEEE Netw 31(1):28–35CrossRef
11.
go back to reference Zou X, et al (2021) The dilemma between deduplication and locality: can both be achieved? pp 171–185 Zou X, et al (2021) The dilemma between deduplication and locality: can both be achieved? pp 171–185
12.
go back to reference Shabeera T, Madhu Kumar S (2015) Optimising virtual machine allocation in MapReduce cloud for improved data locality. Int J Big Data Intell 2(1):2–8CrossRef Shabeera T, Madhu Kumar S (2015) Optimising virtual machine allocation in MapReduce cloud for improved data locality. Int J Big Data Intell 2(1):2–8CrossRef
13.
go back to reference Lee S, Jo J, Kim Y (2019) Survey of data locality in apache hadoop. IEEE, pp 46–53 Lee S, Jo J, Kim Y (2019) Survey of data locality in apache hadoop. IEEE, pp 46–53
14.
go back to reference Palanisamy B, Singh A, Liu L, Jain B, Lathrop SA, Costa J, Kramer W (2011) Purlieus: locality-aware resource allocation for MapReduce in a cloud. In: Lathrop SA, Costa J, Kramer W (eds.), Proceedings of conference on high performance computing networking, storage and analysis, ACM, Seattle, November 12–18 2011, pp 58:1–58:11 Palanisamy B, Singh A, Liu L, Jain B, Lathrop SA, Costa J, Kramer W (2011) Purlieus: locality-aware resource allocation for MapReduce in a cloud. In: Lathrop SA, Costa J, Kramer W (eds.), Proceedings of conference on high performance computing networking, storage and analysis, ACM, Seattle, November 12–18 2011, pp 58:1–58:11
15.
go back to reference Tang S, Lee B, He B (2014) DynamicMR: a dynamic slot allocation optimization framework for MapReduce clusters. IEEE Trans Cloud Comput 2(3):333–347CrossRef Tang S, Lee B, He B (2014) DynamicMR: a dynamic slot allocation optimization framework for MapReduce clusters. IEEE Trans Cloud Comput 2(3):333–347CrossRef
16.
go back to reference Naik NS, Negi A, Br TB, Anitha R (2019) A data locality based scheduler to enhance MapReduce performance in heterogeneous environments. Future Gener Comput Syst 90:423–434CrossRef Naik NS, Negi A, Br TB, Anitha R (2019) A data locality based scheduler to enhance MapReduce performance in heterogeneous environments. Future Gener Comput Syst 90:423–434CrossRef
17.
go back to reference Ru J, Yang Y, Grundy J, Keung J, Hao L (2019) A highly efficient data locality aware task scheduler for cloud-based systems. IEEE, pp 496–498 Ru J, Yang Y, Grundy J, Keung J, Hao L (2019) A highly efficient data locality aware task scheduler for cloud-based systems. IEEE, pp 496–498
18.
go back to reference Jalalian Z, Sharifi M (2021) A hierarchical multi-objective task scheduling approach for fast big data processing. J Supercomput 78:1–30 Jalalian Z, Sharifi M (2021) A hierarchical multi-objective task scheduling approach for fast big data processing. J Supercomput 78:1–30
19.
go back to reference Bittencourt L et al (2018) The internet of things, fog and cloud continuum: integration and challenges. Internet Things 3:134CrossRef Bittencourt L et al (2018) The internet of things, fog and cloud continuum: integration and challenges. Internet Things 3:134CrossRef
20.
go back to reference Li S, Lan T, Ra M-R, Panta R (2018) Joint scheduling and source selection for background traffic in erasure-coded storage. IEEE Trans Parallel Distrib Syst 29(12):2826–2837CrossRef Li S, Lan T, Ra M-R, Panta R (2018) Joint scheduling and source selection for background traffic in erasure-coded storage. IEEE Trans Parallel Distrib Syst 29(12):2826–2837CrossRef
21.
go back to reference Epstein L, van Stee R (2011) Improved results for a memory allocation problem. Theory Comput Syst 48(1):79–92CrossRefMATH Epstein L, van Stee R (2011) Improved results for a memory allocation problem. Theory Comput Syst 48(1):79–92CrossRefMATH
22.
go back to reference Garey MR, Johnson DS (1978) “Strong’’ NP-completeness results: motivation, examples, and implications. J ACM 25(3):499–508CrossRefMATH Garey MR, Johnson DS (1978) “Strong’’ NP-completeness results: motivation, examples, and implications. J ACM 25(3):499–508CrossRefMATH
23.
go back to reference Li C, Cai Q, Lou Y (2022) Optimal data placement strategy considering capacity limitation and load balancing in geographically distributed cloud. Future Gener Comput Syst 127:142–159CrossRef Li C, Cai Q, Lou Y (2022) Optimal data placement strategy considering capacity limitation and load balancing in geographically distributed cloud. Future Gener Comput Syst 127:142–159CrossRef
24.
go back to reference Ru J, Yang Y, Grundy J, Keung J, Hao L (2021) An efficient deadline constrained and data locality aware dynamic scheduling framework for multitenancy clouds. Concurr Comput Pract Exp 33(5):e6037CrossRef Ru J, Yang Y, Grundy J, Keung J, Hao L (2021) An efficient deadline constrained and data locality aware dynamic scheduling framework for multitenancy clouds. Concurr Comput Pract Exp 33(5):e6037CrossRef
25.
go back to reference Swain CK, Gupta B, Sahu A (2020) Constraint aware profit maximization scheduling of tasks in heterogeneous datacenters. Computing 102(10):2229–2255CrossRefMATH Swain CK, Gupta B, Sahu A (2020) Constraint aware profit maximization scheduling of tasks in heterogeneous datacenters. Computing 102(10):2229–2255CrossRefMATH
26.
go back to reference Li C et al (2019) Data locality optimization based on data migration and hotspots prediction in geo-distributed cloud environment. Knowl Based Syst 165:321–334CrossRef Li C et al (2019) Data locality optimization based on data migration and hotspots prediction in geo-distributed cloud environment. Knowl Based Syst 165:321–334CrossRef
27.
go back to reference Xu M, Alamro S, Lan T, Subramaniam S (2018) Chronos: a unifying optimization framework for speculative execution of deadline-critical MapReduce jobs. IEEE, pp 718–729 Xu M, Alamro S, Lan T, Subramaniam S (2018) Chronos: a unifying optimization framework for speculative execution of deadline-critical MapReduce jobs. IEEE, pp 718–729
28.
go back to reference Li C, Liu J, Wang M, Luo Y (2022) Fault-tolerant scheduling and data placement for scientific workflow processing in geo-distributed clouds. J Syst Softw 187:111227CrossRef Li C, Liu J, Wang M, Luo Y (2022) Fault-tolerant scheduling and data placement for scientific workflow processing in geo-distributed clouds. J Syst Softw 187:111227CrossRef
29.
go back to reference Kang Y, Pan L, Liu S (2022) Job scheduling for big data analytical applications in clouds: a taxonomy study. Future Gener Comput Syst 135:129CrossRef Kang Y, Pan L, Liu S (2022) Job scheduling for big data analytical applications in clouds: a taxonomy study. Future Gener Comput Syst 135:129CrossRef
30.
go back to reference Albers S, Quedenfeld J (2018) Optimal algorithms for right-sizing data centers Albers S, Quedenfeld J (2018) Optimal algorithms for right-sizing data centers
31.
go back to reference Alamro S, Xu M, Lan T, Subramaniam S (2020) Shed+: optimal dynamic speculation to meet application deadlines in cloud. IEEE Trans Netw Serv Manag 17(3):1515–1526CrossRef Alamro S, Xu M, Lan T, Subramaniam S (2020) Shed+: optimal dynamic speculation to meet application deadlines in cloud. IEEE Trans Netw Serv Manag 17(3):1515–1526CrossRef
32.
go back to reference Toosi AN, Sinnott RO, Buyya R (2018) Resource provisioning for data-intensive applications with deadline constraints on hybrid clouds using Aneka. Future Gener Comput Syst 79:765–775CrossRef Toosi AN, Sinnott RO, Buyya R (2018) Resource provisioning for data-intensive applications with deadline constraints on hybrid clouds using Aneka. Future Gener Comput Syst 79:765–775CrossRef
33.
go back to reference Yuan H et al (2017) TTSA: an effective scheduling approach for delay bounded tasks in hybrid clouds. IEEE Trans Cybern 47(11):3658–3668CrossRef Yuan H et al (2017) TTSA: an effective scheduling approach for delay bounded tasks in hybrid clouds. IEEE Trans Cybern 47(11):3658–3668CrossRef
34.
go back to reference Kaur K, Kumar N, Garg S, Rodrigues JJPC (2018) EnLoc: data locality-aware energy-efficient scheduling scheme for cloud data centers. pp 1–6 Kaur K, Kumar N, Garg S, Rodrigues JJPC (2018) EnLoc: data locality-aware energy-efficient scheduling scheme for cloud data centers. pp 1–6
35.
go back to reference Lee JH, Jang H, Kim HJ (2020) Iterative job splitting algorithms for parallel machine scheduling with job splitting and setup resource constraints. J Oper Res Soc 72:1–20 Lee JH, Jang H, Kim HJ (2020) Iterative job splitting algorithms for parallel machine scheduling with job splitting and setup resource constraints. J Oper Res Soc 72:1–20
36.
go back to reference Epstein L, Levin A, van Stee R (2012) Approximation schemes for packing splittable items with cardinality constraints. Algorithmica 62(1–2):102–129CrossRefMATH Epstein L, Levin A, van Stee R (2012) Approximation schemes for packing splittable items with cardinality constraints. Algorithmica 62(1–2):102–129CrossRefMATH
37.
go back to reference LeCun B, Mautor T, Quessette F, Weisser M-A (2015) Bin packing with fragmentable items: presentation and approximations. Theor Comput Sci 602:50–59CrossRefMATH LeCun B, Mautor T, Quessette F, Weisser M-A (2015) Bin packing with fragmentable items: presentation and approximations. Theor Comput Sci 602:50–59CrossRefMATH
38.
go back to reference Grégoire JC, Hamel AM (2014) On scheduling live media streaming in the cloud—a study. IEEE, pp 1–6 Grégoire JC, Hamel AM (2014) On scheduling live media streaming in the cloud—a study. IEEE, pp 1–6
39.
go back to reference Casazza M, Ceselli A (2016) Exactly solving packing problems with fragmentation. Comput Oper Res 75:202–213CrossRefMATH Casazza M, Ceselli A (2016) Exactly solving packing problems with fragmentation. Comput Oper Res 75:202–213CrossRefMATH
40.
go back to reference Beaumont O, Eyraud-Dubois L, Caro CT, Rejeb H (2013) Heterogeneous resource allocation under degree constraints. IEEE Trans Parallel Distrib Syst 24(5):926–937CrossRef Beaumont O, Eyraud-Dubois L, Caro CT, Rejeb H (2013) Heterogeneous resource allocation under degree constraints. IEEE Trans Parallel Distrib Syst 24(5):926–937CrossRef
41.
go back to reference Jaykrishnan G, Levin A (2022) EPTAS for the dual of splittable bin packing with cardinality constraint. arXiv preprint arXiv:2204.04685 Jaykrishnan G, Levin A (2022) EPTAS for the dual of splittable bin packing with cardinality constraint. arXiv preprint arXiv:​2204.​04685
42.
go back to reference Fleszar K (2022) A MILP model and two heuristics for the bin packing problem with conflicts and item fragmentation. Eur J Oper Res 303:37CrossRefMATH Fleszar K (2022) A MILP model and two heuristics for the bin packing problem with conflicts and item fragmentation. Eur J Oper Res 303:37CrossRefMATH
43.
go back to reference Ekici A (2022) Variable-sized bin packing problem with conflicts and item fragmentation. Comput Ind Eng 163:107844CrossRef Ekici A (2022) Variable-sized bin packing problem with conflicts and item fragmentation. Comput Ind Eng 163:107844CrossRef
44.
go back to reference Ekici A (2021) Bin packing problem with conflicts and item fragmentation. Comput Oper Res 126:105113CrossRefMATH Ekici A (2021) Bin packing problem with conflicts and item fragmentation. Comput Oper Res 126:105113CrossRefMATH
45.
go back to reference Jalaparti V et al (2015) Network-aware scheduling for data-parallel jobs: plan when you can. Comput Commun Rev 45(5):407–420CrossRef Jalaparti V et al (2015) Network-aware scheduling for data-parallel jobs: plan when you can. Comput Commun Rev 45(5):407–420CrossRef
Metadata
Title
Meddal: meeting deadlines and data locality via bin packing in cloud environment
Author
Marzieh Malekimajd
Publication date
10-10-2022
Publisher
Springer Vienna
Published in
Computing / Issue 2/2023
Print ISSN: 0010-485X
Electronic ISSN: 1436-5057
DOI
https://doi.org/10.1007/s00607-022-01122-0

Other articles of this Issue 2/2023

Computing 2/2023 Go to the issue

Premium Partner