2018 | OriginalPaper | Book chapter

Scalability and State: A Critical Assessment of Throughput Obtainable on Big Data Streaming Frameworks for Applications With and Without State Information

Authors: Shinhyung Yang, Yonguk Jeong, ChangWan Hong, Hyunje Jun, Bernd Burgstaller

Published in: Euro-Par 2017: Parallel Processing Workshops

Publisher: Springer International Publishing

Abstract

Emerging Big Data streaming applications are facing unbounded (infinite) data sets at a scale of millions of events per second. The information captured in a single event, e.g., GPS position information of mobile phone users, loses value (perishes) over time and requires sub-second latency responses. Conventional Cloud-based batch-processing platforms are inadequate to meet these constraints.
Existing streaming engines exhibit low throughput and are thus equally ill-suited for emerging Big Data streaming applications. To validate this claim, we evaluated the Yahoo streaming benchmark and our own real-time trend detector on three state-of-the-art streaming engines: Apache Storm, Apache Flink and Spark Streaming. We adapted the Kieker dynamic profiling framework to gather accurate profiling information on the throughput and CPU utilization exhibited by the two benchmarks on the Google Compute Engine.
To estimate the performance overhead incurred by current streaming engines, we re-implemented our Java-based trend detector as a multi-threaded, shared-memory application in C++. The achieved throughput of 3.2 million events per second on a stand-alone 2-CPU (44-core) Intel Xeon E5-2699 v4 server is 44 times higher than the maximum throughput achieved with the Apache Storm version of the trend detector deployed on 30 virtual machines (nodes) in the Cloud. Our experiment suggests vertical scaling as a viable alternative to horizontal scaling, especially if shared state has to be maintained in a streaming application. For reproducibility, we have open-sourced our framework configurations on GitHub [1].
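The throughput comparison above can be unpacked with a little arithmetic. The following is a minimal sketch, not a figure reported by the authors: it assumes the "44 times higher" ratio applies directly to events per second, and it assumes load is spread evenly over the 30 Storm nodes and over the 44 cores, neither of which the abstract states. All variable and function names are illustrative.

```python
# Figures quoted in the abstract.
NATIVE_EVENTS_PER_SEC = 3_200_000  # shared-memory version on one server
SPEEDUP_OVER_STORM = 44            # "44 times higher" than the Storm version
NATIVE_CORES = 44                  # 2 CPUs, 44 cores in total
STORM_NODES = 30                   # virtual machines in the Cloud deployment


def implied_storm_throughput(native: float, speedup: float) -> float:
    """Storm throughput implied by the quoted ratio (an inference, not a measurement)."""
    return native / speedup


storm_total = implied_storm_throughput(NATIVE_EVENTS_PER_SEC, SPEEDUP_OVER_STORM)

# Averages under the even-load assumption; actual per-node and per-core
# figures may differ.
storm_per_node = storm_total / STORM_NODES
native_per_core = NATIVE_EVENTS_PER_SEC / NATIVE_CORES

print(f"Implied Storm throughput: {storm_total:,.0f} events/s")
print(f"Average per Storm node:   {storm_per_node:,.0f} events/s")
print(f"Average per native core:  {native_per_core:,.0f} events/s")
```

Under these assumptions, the whole 30-node Storm deployment would process roughly as many events per second as a single core of the shared-memory version does on average, which illustrates the case the abstract makes for vertical scaling.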

References
3. Chintapalli, S., Dagit, D., Evans, B., Farivar, R., Graves, T., Holderbaugh, M., Liu, Z., Nusbaum, K., Patil, K., Peng, B.J., Poulos, P.: Benchmarking streaming computation engines: Storm, Flink and Spark streaming. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, pp. 1789–1792, May 2016
5. van Hoorn, A., Waller, J., Hasselbring, W.: Kieker: a framework for application performance monitoring and dynamic software analysis. In: Proceedings of 3rd ACM/SPEC International Conference on Performance Engineering, ICPE 2012, pp. 247–248. ACM, New York (2012)
6. McSherry, F., Isard, M., Murray, D.G.: Scalability! But at what cost? In: Proceedings of 15th USENIX Conference on Hot Topics in Operating Systems, p. 14, May 2015
8. Treibig, J., Hager, G., Wellein, G.: LIKWID: a lightweight performance-oriented tool suite for x86 multicore environments. In: Proceedings of First International Workshop on Parallel Software Tools and Tool Infrastructures, PSTI 2010, San Diego, CA (2010)
9. Wang, J., Zhang, K., Tang, X., Hua, B.: B-queue: efficient and practical queuing for fast core-to-core communication. Int. J. Parallel Prog. 41(1), 137–159 (2013)
Metadata
Title
Scalability and State: A Critical Assessment of Throughput Obtainable on Big Data Streaming Frameworks for Applications With and Without State Information
Authors
Shinhyung Yang
Yonguk Jeong
ChangWan Hong
Hyunje Jun
Bernd Burgstaller
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-75178-8_12