Skip to main content

2019 | OriginalPaper | Buchkapitel

Processing of Aggregate Continuous Queries in a Distributed Environment

verfasst von : Anatoli U. Shein, Panos K. Chrysanthis, Alexandros Labrinidis

Erschienen in: Real-Time Business Intelligence and Analytics

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Data Stream Management Systems (DSMSs) performing online analytics rely on the efficient execution of large numbers of Aggregate Continuous Queries (ACQs). In this paper, we study the problem of generating high quality execution plans of ACQs in DSMSs deployed on multi-node (multi-core and multi-processor) distributed environments. Towards this goal, we classify optimizers based on how they partition the workload among computing nodes and on their usage of the concept of Weavability, which is utilized by the state-of-the-art WeaveShare optimizer to selectively combine ACQs and produce low cost execution plans for single-node environments. For each category, we propose an optimizer, which either adopts an existing strategy or develops a new one for assigning and grouping ACQs to computing nodes. We implement and experimentally compare all of our proposed optimizers in terms of (1) keeping the total cost of the ACQs execution plan low and (2) balancing the load among the computing nodes. Our extensive experimental evaluation shows that our newly developed Weave-Group to Nodes (\(WG_{TN}\)) and Weave-Group Inserted (\(WG_{I}\)) optimizers produce plans of significantly higher quality than the rest of the optimizers. \(WG_{TN}\) minimizes the total cost, making it more suitable from a client perspective, and \(WG_{I}\) achieves load balancing, making it more suitable from a system perspective.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
4.
Zurück zum Zitat Abadi, D.J., et al.: Aurora: a new model and architecture for data stream management. VLDBJ 12(2), 120–139 (2003)CrossRef Abadi, D.J., et al.: Aurora: a new model and architecture for data stream management. VLDBJ 12(2), 120–139 (2003)CrossRef
5.
Zurück zum Zitat Akidau, T., et al.: Millwheel: fault-tolerant stream processing at internet scale. PVLDB 6(3), 1033–1044 (2013) Akidau, T., et al.: Millwheel: fault-tolerant stream processing at internet scale. PVLDB 6(3), 1033–1044 (2013)
6.
Zurück zum Zitat Ananthanarayanan, R., et al.: Photon: fault-tolerant and scalable joining of continuous data streams. In: ACM SIGMOD, pp. 577–588 (2013) Ananthanarayanan, R., et al.: Photon: fault-tolerant and scalable joining of continuous data streams. In: ACM SIGMOD, pp. 577–588 (2013)
7.
Zurück zum Zitat Chrysanthis, P.K.: AQSIOS - next generation data stream management system. CONET Newslett. 9, 1–3 (2010) Chrysanthis, P.K.: AQSIOS - next generation data stream management system. CONET Newslett. 9, 1–3 (2010)
9.
Zurück zum Zitat Condie, T.: Online aggregation and continuous query support in mapreduce. In: ACM SIGMOD, pp. 1115–1118 (2010) Condie, T.: Online aggregation and continuous query support in mapreduce. In: ACM SIGMOD, pp. 1115–1118 (2010)
10.
Zurück zum Zitat Ghanem, T.M., Hammad, M.A., Mokbel, M.F., Aref, W.G., Elmagarmid, A.K.: Incremental evaluation of sliding-window queries over data streams. IEEE TKDE 19(1), 57–72 (2007) Ghanem, T.M., Hammad, M.A., Mokbel, M.F., Aref, W.G., Elmagarmid, A.K.: Incremental evaluation of sliding-window queries over data streams. IEEE TKDE 19(1), 57–72 (2007)
11.
Zurück zum Zitat Guirguis, S., Sharaf, M., Chrysanthis, P.K., Labrinidis, A.: Three-level processing of multiple aggregate continuous queries. In: IEEE ICDE, pp. 929–940 (2012) Guirguis, S., Sharaf, M., Chrysanthis, P.K., Labrinidis, A.: Three-level processing of multiple aggregate continuous queries. In: IEEE ICDE, pp. 929–940 (2012)
12.
Zurück zum Zitat Guirguis, S., Sharaf, M.A., Chrysanthis, P.K., Labrinidis, A.: Optimized processing of multiple aggregate continuous queries. In: ACM CIKM, pp. 357–368 (2011) Guirguis, S., Sharaf, M.A., Chrysanthis, P.K., Labrinidis, A.: Optimized processing of multiple aggregate continuous queries. In: ACM CIKM, pp. 357–368 (2011)
13.
Zurück zum Zitat Gulisano, V., Jimenez-Peris, R., Patino-Martinez, M., Soriente, C., Valduriez, P.: Streamcloud: an elastic and scalable data streaming system. IEEE TPDS 23(12), 2351–2365 (2012) Gulisano, V., Jimenez-Peris, R., Patino-Martinez, M., Soriente, C., Valduriez, P.: Streamcloud: an elastic and scalable data streaming system. IEEE TPDS 23(12), 2351–2365 (2012)
14.
Zurück zum Zitat Katsipoulakis, N.R., Thoma, C., Gratta, E.A., Labrinidis, A., Lee, A.J., Chrysanthis, P.K.: CE-Storm: confidential elastic processing of data streams. In: ACM SIGMOD, pp. 859–864 (2015) Katsipoulakis, N.R., Thoma, C., Gratta, E.A., Labrinidis, A., Lee, A.J., Chrysanthis, P.K.: CE-Storm: confidential elastic processing of data streams. In: ACM SIGMOD, pp. 859–864 (2015)
15.
Zurück zum Zitat Krishnamurthy, S., Wu, C., Franklin, M.: On-the-fly sharing for streamed aggregation. In: ACM SIGMOD, pp. 623–634 (2006) Krishnamurthy, S., Wu, C., Franklin, M.: On-the-fly sharing for streamed aggregation. In: ACM SIGMOD, pp. 623–634 (2006)
16.
Zurück zum Zitat Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: No pane, no gain: efficient evaluation of sliding-window aggregates over data streams. ACM SIGMOD Rec. 34, 39–44 (2005)CrossRef Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: No pane, no gain: efficient evaluation of sliding-window aggregates over data streams. ACM SIGMOD Rec. 34, 39–44 (2005)CrossRef
17.
Zurück zum Zitat Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: Semantics and evaluation techniques for window aggregates in data streams. In: ACM SIGMOD, pp. 311–322 (2005) Li, J., Maier, D., Tufte, K., Papadimos, V., Tucker, P.A.: Semantics and evaluation techniques for window aggregates in data streams. In: ACM SIGMOD, pp. 311–322 (2005)
18.
Zurück zum Zitat Motwani, R., et al.: Query processing, approximation, and resource management in a data stream management system. In: CIDR (2003) Motwani, R., et al.: Query processing, approximation, and resource management in a data stream management system. In: CIDR (2003)
19.
Zurück zum Zitat Naidu, K., Rastogi, R., Satkin, S., Srinivasan, A.: Memory-constrained aggregate computation over data streams. In: IEEE ICDE, pp. 852–863 (2011) Naidu, K., Rastogi, R., Satkin, S., Srinivasan, A.: Memory-constrained aggregate computation over data streams. In: IEEE ICDE, pp. 852–863 (2011)
20.
Zurück zum Zitat Romeijn, H.E., Morales, D.R.: A class of greedy algorithms for the generalized assignment problem. Discrete Appl. Math. 103, 209–235 (2000)MathSciNetCrossRef Romeijn, H.E., Morales, D.R.: A class of greedy algorithms for the generalized assignment problem. Discrete Appl. Math. 103, 209–235 (2000)MathSciNetCrossRef
21.
Zurück zum Zitat Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient and extensible algorithms for multi query optimization. In: ACM SIGMOD, pp. 249–260 (2000)CrossRef Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient and extensible algorithms for multi query optimization. In: ACM SIGMOD, pp. 249–260 (2000)CrossRef
22.
Zurück zum Zitat Sharaf, M.A., Chrysanthis, P.K., Labrinidis, A., Pruhs, K.: Algorithms and metrics for processing multiple heterogeneous continuous queries. IEEE TODS 33, 1–44 (2008)CrossRef Sharaf, M.A., Chrysanthis, P.K., Labrinidis, A., Pruhs, K.: Algorithms and metrics for processing multiple heterogeneous continuous queries. IEEE TODS 33, 1–44 (2008)CrossRef
23.
Zurück zum Zitat Shein, A.U., Chrysanthis, P.K., Labrinidis, A.: F1: accelerating the optimization of aggregate continuous queries. In: ACM CIKM, pp. 1151–1160 (2015) Shein, A.U., Chrysanthis, P.K., Labrinidis, A.: F1: accelerating the optimization of aggregate continuous queries. In: ACM CIKM, pp. 1151–1160 (2015)
24.
Zurück zum Zitat Toshniwal, A., et al.: Storm@ Twitter. In: ACM SIGMOD, pp. 147–156 (2014) Toshniwal, A., et al.: Storm@ Twitter. In: ACM SIGMOD, pp. 147–156 (2014)
25.
Zurück zum Zitat Xing, Y., Zdonik, S., Hwang, J.: Dynamic load distribution in the borealis stream processor. In: IEEE ICDE, pp. 791–802 (2005) Xing, Y., Zdonik, S., Hwang, J.: Dynamic load distribution in the borealis stream processor. In: IEEE ICDE, pp. 791–802 (2005)
26.
Zurück zum Zitat Yao, F., Demers, A., Shenker, S.: A scheduling model for reduced CPU energy. In: ACM FOCS, pp. 374–382 (1995) Yao, F., Demers, A., Shenker, S.: A scheduling model for reduced CPU energy. In: ACM FOCS, pp. 374–382 (1995)
27.
Zurück zum Zitat Zhang, R., Koudas, N., Ooi, B.C., Srivastava, D.: Multiple aggregations over data streams. In: ACM SIGMOD, pp. 299–310 (2005) Zhang, R., Koudas, N., Ooi, B.C., Srivastava, D.: Multiple aggregations over data streams. In: ACM SIGMOD, pp. 299–310 (2005)
28.
Zurück zum Zitat Zhang, R., Koudas, N., Ooi, B.C., Srivastava, D., Zhou, P.: Streaming multiple aggregations using phantoms. VLDBJ 19(4), 557–583 (2010)CrossRef Zhang, R., Koudas, N., Ooi, B.C., Srivastava, D., Zhou, P.: Streaming multiple aggregations using phantoms. VLDBJ 19(4), 557–583 (2010)CrossRef
Metadaten
Titel
Processing of Aggregate Continuous Queries in a Distributed Environment
verfasst von
Anatoli U. Shein
Panos K. Chrysanthis
Alexandros Labrinidis
Copyright-Jahr
2019
DOI
https://doi.org/10.1007/978-3-030-24124-7_4

Neuer Inhalt