Published in: Cluster Computing 6/2019

26.02.2018

Towards optimal resource provisioning for Hadoop-MapReduce jobs using scale-out strategy and its performance analysis in private cloud environment

Authors: Ramakrishnan Ramanathan, B. Latha



Abstract

Cloud computing provides IT resources on demand, without additional waiting time, which makes data analytics one of the areas that can benefit most from it. This work addresses running MapReduce programs in the cloud so that resource provisioning is optimized and MapReduce jobs finish within a quantified time. We assess the efficacy and accuracy of a regression-based performance model, built with linear regression, for predicting MapReduce job completion time on the Hadoop cluster in our OpenStack private cloud. Cloud service providers currently offer no resource provisioning technique or policies for satisfying user jobs that have deadline requirements; instead, the current system requires the cloud user to estimate the quantity of resources needed to run jobs in the cloud. Our proposed scale-out strategy is used to obtain accurate predictions of job completion times, and our experimental results show the performance level of MapReduce benchmarks in the OpenStack private cloud. The regression-based performance model predicts and evaluates the execution time of five popular MapReduce benchmark applications in our private cloud environment with better resource utilization, with a reported accuracy of 99%.
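
As an illustration of the idea in the abstract, the following is a minimal sketch of how a linear regression could relate job completion time to cluster size and then pick a scale-out level for a deadline. The abstract does not state the model's form or features, so everything here is an assumption: the model T(n) = a + b/n, the function names, and the sample measurements are hypothetical and synthetic, and none of it is taken from the paper's experiments.

```python
# Hypothetical sketch: fit T(n) = a + b / n by ordinary least squares,
# where n is the number of worker nodes and T is job completion time.
# The model form and all numbers below are assumptions for illustration.

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_time(model, nodes):
    """Predicted completion time (seconds) on the given number of nodes."""
    intercept, slope = model
    return intercept + slope / nodes


def min_nodes_for_deadline(model, deadline, max_nodes=64):
    """Smallest node count whose predicted time meets the deadline, or None."""
    for nodes in range(1, max_nodes + 1):
        if predict_time(model, nodes) <= deadline:
            return nodes
    return None


# Synthetic training points (NOT measurements from the paper):
# completion times in seconds observed at four cluster sizes.
node_counts = [2, 4, 8, 16]
completion_times = [270.0, 150.0, 90.0, 60.0]

# Regress time against the transformed feature x = 1 / n.
model = fit_line([1.0 / n for n in node_counts], completion_times)

if __name__ == "__main__":
    print("fitted (a, b):", model)
    print("nodes needed for a 100 s deadline:",
          min_nodes_for_deadline(model, 100.0))
```

The transformed feature 1/n keeps the fit linear in its coefficients while letting the predicted time fall as nodes are added; a real model for a Hadoop cluster would likely need further features such as input size and the number of map and reduce tasks.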

Metadata
Title
Towards optimal resource provisioning for Hadoop-MapReduce jobs using scale-out strategy and its performance analysis in private cloud environment
Authors
Ramakrishnan Ramanathan
B. Latha
Publication date
26.02.2018
Publisher
Springer US
Published in
Cluster Computing / Special Issue 6/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-2234-8
