
2017 | Original Paper | Book Chapter

Elastic Resource Provisioning for Batched Stream Processing System in Container Cloud

Authors: Song Wu, Xingjun Wang, Hai Jin, Haibao Chen

Published in: Web and Big Data

Publisher: Springer International Publishing


Abstract

Batched stream processing systems achieve higher throughput than traditional stream processing systems while still providing low-latency guarantees. Such systems are increasingly deployed in the cloud because they require elasticity and cost efficiency. However, their performance is hard to guarantee there, because static resource provisioning cannot accommodate stream fluctuation and uneven workload distribution. In this paper, we propose EStream, an elastic batched stream processing system based on Spark Streaming that transparently adjusts the available resources to handle workload fluctuation and uneven workload distribution in a container cloud. Specifically, EStream automatically scales the cluster when it detects resource insufficiency or over-provisioning under workload fluctuation. It also schedules resources within the cluster according to the workload distribution. Experimental results show that, compared with the original Spark Streaming, EStream handles workload fluctuation and uneven distribution transparently and improves resource efficiency.
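
The abstract does not say how resource insufficiency or over-provisioning is detected. As a rough illustration only, and not EStream's actual algorithm, the sketch below assumes a common rule of thumb for batched stream processing: a batch should finish within its batch interval, so the ratio of processing time to batch interval can act as a load signal. The names `ScalingPolicy` and `decide_containers`, and all threshold values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # Hypothetical thresholds on the load ratio
    # (batch processing time / batch interval); not taken from the paper.
    scale_out_above: float = 0.9
    scale_in_below: float = 0.5
    min_containers: int = 1
    max_containers: int = 16


def decide_containers(current: int, processing_time_s: float,
                      batch_interval_s: float, policy: ScalingPolicy) -> int:
    """Return the container count to use for the upcoming batches."""
    load = processing_time_s / batch_interval_s
    if load > policy.scale_out_above:
        target = current + 1   # batches barely fit the interval: add capacity
    elif load < policy.scale_in_below:
        target = current - 1   # plenty of headroom: release capacity
    else:
        target = current       # within the comfortable band: hold steady
    # Never leave the allowed cluster size range.
    return max(policy.min_containers, min(policy.max_containers, target))
```

Stepping by one container per decision keeps the sketch simple; a real system would likely also smooth the load signal over several batches to avoid oscillating between scaling out and scaling in.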

Metadata
Title
Elastic Resource Provisioning for Batched Stream Processing System in Container Cloud
Authors
Song Wu
Xingjun Wang
Hai Jin
Haibao Chen
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-63579-8_32