
2018 | Original Paper | Book Chapter

Network-Aware Grouping in Distributed Stream Processing Systems

Authors: Fei Chen, Song Wu, Hai Jin

Published in: Algorithms and Architectures for Parallel Processing

Publisher: Springer International Publishing


Abstract

Distributed Stream Processing (DSP) systems have recently attracted much attention because they can process huge volumes of real-time stream data with very low latency on clusters of commodity hardware. Existing workload grouping strategies in a DSP system fall into four categories: raw and blind, data skewness, cluster heterogeneity, and dynamic load-aware. However, these traditional stream grouping strategies do not consider the network distance between two communicating operators. In fact, traffic over different network channels has a significant impact on performance. How to group tuples according to network distance in order to improve performance has therefore become a critical problem.
In this paper, we propose a network-aware grouping framework called Squirrel that improves performance under different network distances. By identifying the network locations of two communicating operators, Squirrel sets a weight and a priority for each network channel. It introduces Weight Grouping, which assigns a different number of tuples to each network channel according to the channel's weight and priority. To adapt to changes in network conditions, input load, resources, and other factors, Squirrel uses Dynamic Weight Control to adjust each channel's weight and priority online by analyzing runtime information. Experimental results demonstrate Squirrel's effectiveness and show that it can achieve a 1.67x improvement in throughput and reduce latency by 47%.
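
The abstract does not spell out how Weight Grouping picks a channel, so the following is only a minimal sketch of the general idea of splitting tuples across channels in proportion to their weights. It substitutes a generic smooth weighted round-robin scheme, not Squirrel's actual algorithm. The `Channel` class, the channel names, the weight values, and the use of priority as a tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """A hypothetical network channel between two communicating operators."""
    name: str         # illustrative label, not taken from the paper
    weight: int       # relative share of tuples this channel should receive
    priority: int     # lower value wins ties (an assumption of this sketch)
    current: int = 0  # running credit used by smooth weighted round-robin


def pick_channel(channels):
    """Select the next channel using smooth weighted round-robin."""
    total = sum(c.weight for c in channels)
    for c in channels:
        c.current += c.weight
    # Highest credit wins; ties go to the channel with the lower priority value.
    chosen = max(channels, key=lambda c: (c.current, -c.priority))
    chosen.current -= total
    return chosen


def distribute(channels, n_tuples):
    """Count how many of n_tuples each channel would receive."""
    counts = {c.name: 0 for c in channels}
    for _ in range(n_tuples):
        counts[pick_channel(channels).name] += 1
    return counts


# Assumed weights: closer channels get a larger share of the tuples.
channels = [
    Channel("local", weight=5, priority=0),
    Channel("same-rack", weight=3, priority=1),
    Channel("cross-rack", weight=2, priority=2),
]
print(distribute(channels, 100))
```

With these assumed weights, the 100 tuples are split 50/30/20 across the three channels. A runtime controller in the spirit of Dynamic Weight Control could change a channel's `weight` between calls to `pick_channel`, and later selections would follow the new proportions.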


Metadata
Title
Network-Aware Grouping in Distributed Stream Processing Systems
Authors
Fei Chen
Song Wu
Hai Jin
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-05051-1_1
