Skip to main content
Top
Published in: The VLDB Journal 5/2018

30-08-2017 | Special Issue Paper

Many-query join: efficient shared execution of relational joins on modern hardware

Authors: Darko Makreshanski, Georgios Giannikis, Gustavo Alonso, Donald Kossmann

Published in: The VLDB Journal | Issue 5/2018

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Database architectures typically process queries one at a time, executing concurrent queries in independent execution contexts. Often, such a design leads to unpredictable performance and poor scalability. One approach to circumvent the problem is to take advantage of sharing opportunities across concurrently running queries. In this paper, we propose many-query join (MQJoin), a novel method for sharing the execution of a join that can efficiently deal with hundreds of concurrent queries. This is achieved by minimizing redundant work and making efficient use of main-memory bandwidth and multi-core architectures. Compared to existing proposals, MQJoin is able to efficiently handle larger workloads regardless of the schema by exploiting more sharing opportunities. We also compared MQJoin to two commercial main-memory column-store databases. For a TPC-H-based workload, we show that MQJoin provides 2–5\(\times \) higher throughput with significantly more stable response times.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
Literature
2.
go back to reference Albutiu, M.-C., Kemper, A., Neumann, T.: Massively parallel sort-merge joins in main memory multi-core database systems. PVLDB 5(10), 1064–1075 (2012) Albutiu, M.-C., Kemper, A., Neumann, T.: Massively parallel sort-merge joins in main memory multi-core database systems. PVLDB 5(10), 1064–1075 (2012)
3.
go back to reference Arumugam, S., Dobra, A., Jermaine, C.M., Pansare, N., Perez, L.: The DataPath system: a data-centric analytic processing engine for large data warehouses. Proc. SIGMOD 2010, 519–530 (2010) Arumugam, S., Dobra, A., Jermaine, C.M., Pansare, N., Perez, L.: The DataPath system: a data-centric analytic processing engine for large data warehouses. Proc. SIGMOD 2010, 519–530 (2010)
4.
go back to reference Avnur, R., Hellerstein, J.M.: Eddies: continuously adaptive query processing. Proc. SIGMOD 2000, 261–272 (2000) Avnur, R., Hellerstein, J.M.: Eddies: continuously adaptive query processing. Proc. SIGMOD 2000, 261–272 (2000)
5.
go back to reference Balkesen, C., Alonso, G., Teubner, J., Özsu, M.T.: Multi-core, main-memory joins: sort versus hash revisited. PVLDB 7(1), 85–96 (2013) Balkesen, C., Alonso, G., Teubner, J., Özsu, M.T.: Multi-core, main-memory joins: sort versus hash revisited. PVLDB 7(1), 85–96 (2013)
6.
go back to reference Balkesen, C., Teubner, J., Alonso, G., Özsu, M.T.: Main-memory hash joins on multi-core CPUs: tuning to the underlying hardware. Proc. ICDE 2013, 362–373 (2013) Balkesen, C., Teubner, J., Alonso, G., Özsu, M.T.: Main-memory hash joins on multi-core CPUs: tuning to the underlying hardware. Proc. ICDE 2013, 362–373 (2013)
7.
go back to reference Balkesen, C., Teubner, J., Alonso, G., Özsu, T.: Main-memory hash joins on modern processor architectures. IEEE Trans. Knowl. Data Eng. 27(7), 1754–1766 (2015)CrossRef Balkesen, C., Teubner, J., Alonso, G., Özsu, T.: Main-memory hash joins on modern processor architectures. IEEE Trans. Knowl. Data Eng. 27(7), 1754–1766 (2015)CrossRef
8.
go back to reference Barber, R., Lohman, G., Pandis, I., Raman, V., Sidle, R., Attaluri, G., Chainani, N., Lightstone, S., Sharpe, D.: Memory-efficient hash joins. Proc. VLDB 8(4), 353–364 (2014)CrossRef Barber, R., Lohman, G., Pandis, I., Raman, V., Sidle, R., Attaluri, G., Chainani, N., Lightstone, S., Sharpe, D.: Memory-efficient hash joins. Proc. VLDB 8(4), 353–364 (2014)CrossRef
9.
go back to reference Blanas, S., Li, Y., Patel, J.M.: Design and evaluation of main memory hash join algorithms for multi-core CPUs. Proc. SIGMOD 2011, 37–48 (2011) Blanas, S., Li, Y., Patel, J.M.: Design and evaluation of main memory hash join algorithms for multi-core CPUs. Proc. SIGMOD 2011, 37–48 (2011)
10.
go back to reference Boncz, P.A., Zukowski, M., Nes, N.: MonetDB/X100: hyper-pipelining query execution. Proc. CIDR 2005, 225–237 (2005) Boncz, P.A., Zukowski, M., Nes, N.: MonetDB/X100: hyper-pipelining query execution. Proc. CIDR 2005, 225–237 (2005)
11.
go back to reference Candea, G., Polyzotis, N., Vingralek, R.: A scalable, predictable join operator for highly concurrent data warehouses. PVLDB 2(1), 277–288 (2009) Candea, G., Polyzotis, N., Vingralek, R.: A scalable, predictable join operator for highly concurrent data warehouses. PVLDB 2(1), 277–288 (2009)
12.
go back to reference Chen, C., Roussopoulos, N.: The implementation and performance evaluation of the ADMS query optimizer: integrating query result caching and matching. In: Proc EDBT, pp. 323–336 (1994) Chen, C., Roussopoulos, N.: The implementation and performance evaluation of the ADMS query optimizer: integrating query result caching and matching. In: Proc EDBT, pp. 323–336 (1994)
13.
go back to reference Chen, S., Ailamaki, A., Gibbons, P. B., Mowry, T. C.: Improving hash join performance through prefetching. In: Proc. ICDE 2004, pp. 116– (2004) Chen, S., Ailamaki, A., Gibbons, P. B., Mowry, T. C.: Improving hash join performance through prefetching. In: Proc. ICDE 2004, pp. 116– (2004)
14.
go back to reference Chen, S., Ailamaki, A., Gibbons, P.B., Mowry, T.C.: Improving hash join performance through prefetching. ACM Trans. Database Syst. 32(3), 17 (2007)CrossRef Chen, S., Ailamaki, A., Gibbons, P.B., Mowry, T.C.: Improving hash join performance through prefetching. ACM Trans. Database Syst. 32(3), 17 (2007)CrossRef
15.
go back to reference Ebenstein, R., Kamat, N., Nandi, A.: FluxQuery: an execution framework for highly interactive query workloads. In: Proc SIGMOD, pp. 1333–1345. ACM, New York, NY, USA (2016) Ebenstein, R., Kamat, N., Nandi, A.: FluxQuery: an execution framework for highly interactive query workloads. In: Proc SIGMOD, pp. 1333–1345. ACM, New York, NY, USA (2016)
16.
go back to reference Giannikis, G., Alonso, G., Kossmann, D.: SharedDB: killing one thousand queries with one stone. PVLDB 5(6), 526–537 (2012) Giannikis, G., Alonso, G., Kossmann, D.: SharedDB: killing one thousand queries with one stone. PVLDB 5(6), 526–537 (2012)
17.
go back to reference Giannikis, G., Makreshanski, D., Alonso, G., Kossmann, D.: Shared workload optimization. PVLDB 7(6), 429–440 (2014) Giannikis, G., Makreshanski, D., Alonso, G., Kossmann, D.: Shared workload optimization. PVLDB 7(6), 429–440 (2014)
18.
go back to reference Graefe, G.: Volcano&#151 an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng. 6(1), 120–135 (1994)CrossRef Graefe, G.: Volcano&#151 an extensible and parallel query evaluation system. IEEE Trans. Knowl. Data Eng. 6(1), 120–135 (1994)CrossRef
19.
go back to reference Harizopoulos, S., Ailamaki, A.: StagedDB: designing database servers for modern hardware. In: In IEEE Data, pp. 11–16 (2005) Harizopoulos, S., Ailamaki, A.: StagedDB: designing database servers for modern hardware. In: In IEEE Data, pp. 11–16 (2005)
20.
go back to reference Harizopoulos, S., Shkapenyuk, V., Ailamaki, A.: QPipe: a simultaneously pipelined relational query engine. Proc. SIGMOD 2005, 383–394 (2005) Harizopoulos, S., Shkapenyuk, V., Ailamaki, A.: QPipe: a simultaneously pipelined relational query engine. Proc. SIGMOD 2005, 383–394 (2005)
21.
go back to reference Ivanova, M. G., Kersten, M. L., Nes, N. J., Gonçalves, R. A.: An architecture for recycling intermediates in a column-store. In: Proc. SIGMOD, pp. 309–320. ACM, New York, NY, USA (2009) Ivanova, M. G., Kersten, M. L., Nes, N. J., Gonçalves, R. A.: An architecture for recycling intermediates in a column-store. In: Proc. SIGMOD, pp. 309–320. ACM, New York, NY, USA (2009)
22.
go back to reference Jha, S., He, B., Lu, M., Cheng, X., Huynh, H.P.: Improving main memory hash joins on intel xeon phi processors: an experimental approach. PVLDB 8(6), 642–653 (2015) Jha, S., He, B., Lu, M., Cheng, X., Huynh, H.P.: Improving main memory hash joins on intel xeon phi processors: an experimental approach. PVLDB 8(6), 642–653 (2015)
23.
go back to reference Johnson, R., Harizopoulos, S., Hardavellas, N., Sabirli, K., Pandis, I., Ailamaki, A., Mancheril, N.G., Falsafi, B.: To share or not to share? Proc. VLDB 2007, 351–362 (2007) Johnson, R., Harizopoulos, S., Hardavellas, N., Sabirli, K., Pandis, I., Ailamaki, A., Mancheril, N.G., Falsafi, B.: To share or not to share? Proc. VLDB 2007, 351–362 (2007)
24.
go back to reference Kim, C., Kaldewey, T., Lee, V.W., Sedlar, E., Nguyen, A.D., Satish, N., Chhugani, J., Di Blas, A., Dubey, P.: Sort versus hash revisited: fast join implementation on modern multi-core CPUs. PVLDB 2(2), 1378–1389 (2009) Kim, C., Kaldewey, T., Lee, V.W., Sedlar, E., Nguyen, A.D., Satish, N., Chhugani, J., Di Blas, A., Dubey, P.: Sort versus hash revisited: fast join implementation on modern multi-core CPUs. PVLDB 2(2), 1378–1389 (2009)
25.
go back to reference Krikellas, K., Inc, G., Viglas, S. D., Cintra, M.: Modeling multithreaded query execution on chip multiprocessors. In ADMS (2010) Krikellas, K., Inc, G., Viglas, S. D., Cintra, M.: Modeling multithreaded query execution on chip multiprocessors. In ADMS (2010)
26.
go back to reference Lang, C.A., Bhattacharjee, B., Malkemus, T., Padmanabhan, S., Wong, K.: Increasing buffer-locality for multiple relational table scans through grouping and throttling. Proc. ICDE 2007, 1136–1145 (2007) Lang, C.A., Bhattacharjee, B., Malkemus, T., Padmanabhan, S., Wong, K.: Increasing buffer-locality for multiple relational table scans through grouping and throttling. Proc. ICDE 2007, 1136–1145 (2007)
27.
go back to reference Lang, C. A., Bhattacharjee, B., Malkemus, T., Wong, K.: Increasing buffer-locality for multiple index based scans through intelligent placement and index scan speed control. In: Proc. VLDB, pp. 1298–1309 (2007) Lang, C. A., Bhattacharjee, B., Malkemus, T., Wong, K.: Increasing buffer-locality for multiple index based scans through intelligent placement and index scan speed control. In: Proc. VLDB, pp. 1298–1309 (2007)
28.
go back to reference Lang, H., Mühlbauer, T., Funke, F., Boncz, P. A., Neumann, T., Kemper, A.: Data blocks: hybrid OLTP and OLAP on compressed storage using both vectorization and compilation. In: Proceedings of the 2016 International Conference on Management of Data, SIGMOD ’16, pp. 311–326. ACM, New York, NY, USA (2016) Lang, H., Mühlbauer, T., Funke, F., Boncz, P. A., Neumann, T., Kemper, A.: Data blocks: hybrid OLTP and OLAP on compressed storage using both vectorization and compilation. In: Proceedings of the 2016 International Conference on Management of Data, SIGMOD ’16, pp. 311–326. ACM, New York, NY, USA (2016)
29.
go back to reference Larson, P.-A., Birka, A., Hanson, E.N., Huang, W., Nowakiewicz, M., Papadimos, V.: Real-time analytical processing with SQL server. Proc. VLDB 8(12), 1740–1751 (2015)CrossRef Larson, P.-A., Birka, A., Hanson, E.N., Huang, W., Nowakiewicz, M., Papadimos, V.: Real-time analytical processing with SQL server. Proc. VLDB 8(12), 1740–1751 (2015)CrossRef
30.
go back to reference Liu, F., Blanas, S.: Forecasting the cost of processing multi-join queries via hashing for main-memory databases. In Proc. SoCC, pp. 153–166. ACM, New York, NY, USA (2015) Liu, F., Blanas, S.: Forecasting the cost of processing multi-join queries via hashing for main-memory databases. In Proc. SoCC, pp. 153–166. ACM, New York, NY, USA (2015)
31.
go back to reference Makreshanski, D., Giceva, J., Barthels, C., Alonso, G.: BatchDB: efficient isolated execution of hybrid OLTP+OLAP workloads for interactive applications. In: Proc. SIGMOD, pp. 37–50. ACM, New York, NY, USA (2017) Makreshanski, D., Giceva, J., Barthels, C., Alonso, G.: BatchDB: efficient isolated execution of hybrid OLTP+OLAP workloads for interactive applications. In: Proc. SIGMOD, pp. 37–50. ACM, New York, NY, USA (2017)
32.
go back to reference Manegold, S., Boncz, P., Kersten, M.: Optimizing main-memory join on modern hardware. IEEE Trans. Knowl. Data Eng. 14(4), 709–730 (2002)CrossRef Manegold, S., Boncz, P., Kersten, M.: Optimizing main-memory join on modern hardware. IEEE Trans. Knowl. Data Eng. 14(4), 709–730 (2002)CrossRef
33.
go back to reference Manegold, S., Boncz, P., Kersten, M. L.: Generic database cost models for hierarchical memory systems. In: Proc VLDB, pp. 191–202. VLDB Endowment (2002) Manegold, S., Boncz, P., Kersten, M. L.: Generic database cost models for hierarchical memory systems. In: Proc VLDB, pp. 191–202. VLDB Endowment (2002)
34.
go back to reference Manegold, S., Pellenkoft, A., Kersten, M. L.: A multi-query optimizer for Monet. In: Proc. BNCOD, pp. 36–50. Springer, London, UK (2000) Manegold, S., Pellenkoft, A., Kersten, M. L.: A multi-query optimizer for Monet. In: Proc. BNCOD, pp. 36–50. Springer, London, UK (2000)
35.
go back to reference Müller, I., Sanders, P., Lacurie, A., Lehner, W., Färber, F.: Cache-efficient aggregation: hashing is sorting. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Proc. SIGMOD 2015, pp. 1123–1136. ACM, New York, NY, USA (2015) Müller, I., Sanders, P., Lacurie, A., Lehner, W., Färber, F.: Cache-efficient aggregation: hashing is sorting. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Proc. SIGMOD 2015, pp. 1123–1136. ACM, New York, NY, USA (2015)
36.
go back to reference O’Neil, P., Graefe, G.: Multi-table joins through bitmapped join indices. SIGMOD Rec. 24(3), 8–11 (1995)CrossRef O’Neil, P., Graefe, G.: Multi-table joins through bitmapped join indices. SIGMOD Rec. 24(3), 8–11 (1995)CrossRef
38.
go back to reference Psaroudakis, I., Athanassoulis, M., Ailamaki, A.: Sharing data and work across concurrent analytical queries. PVLDB 6(9), 637–648 (2013) Psaroudakis, I., Athanassoulis, M., Ailamaki, A.: Sharing data and work across concurrent analytical queries. PVLDB 6(9), 637–648 (2013)
39.
go back to reference Qiao, L., Raman, V., Reiss, F., Haas, P.J., Lohman, G.M.: Main-memory scan sharing for multi-core CPUs. PVLDB 1(1), 610–621 (2008) Qiao, L., Raman, V., Reiss, F., Haas, P.J., Lohman, G.M.: Main-memory scan sharing for multi-core CPUs. PVLDB 1(1), 610–621 (2008)
40.
go back to reference Raman, V., Attaluri, G., Barber, R., Chainani, N., Kalmuk, D., KulandaiSamy, V., Leenstra, J., Lightstone, S., Liu, S., Lohman, G.M., Malkemus, T., Mueller, R., Pandis, I., Schiefer, B., Sharpe, D., Sidle, R., Storm, A., Zhang, L.: DB2 with BLU acceleration: so much more than just a column store. Proc. VLDB 6(11), 1080–1091 (2013)CrossRef Raman, V., Attaluri, G., Barber, R., Chainani, N., Kalmuk, D., KulandaiSamy, V., Leenstra, J., Lightstone, S., Liu, S., Lohman, G.M., Malkemus, T., Mueller, R., Pandis, I., Schiefer, B., Sharpe, D., Sidle, R., Storm, A., Zhang, L.: DB2 with BLU acceleration: so much more than just a column store. Proc. VLDB 6(11), 1080–1091 (2013)CrossRef
41.
go back to reference Raman, V., Swart, G., Qiao, L., Reiss, F., Dialani, V., Kossmann, D., Narang, I., Sidle, R.: Constant-time query processing. In: Proc. ICDE 2008, pp. 60–69 (2008) Raman, V., Swart, G., Qiao, L., Reiss, F., Dialani, V., Kossmann, D., Narang, I., Sidle, R.: Constant-time query processing. In: Proc. ICDE 2008, pp. 60–69 (2008)
42.
go back to reference Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient and extensible algorithms for multi query optimization. In: Proc. SIGMOD, pp. 249–260. ACM, New York, NY, USA (2000) Roy, P., Seshadri, S., Sudarshan, S., Bhobe, S.: Efficient and extensible algorithms for multi query optimization. In: Proc. SIGMOD, pp. 249–260. ACM, New York, NY, USA (2000)
43.
go back to reference Răducanu, B., Boncz, P., Zukowski, M.: Micro adaptivity in vectorwise. In: Proc. SIGMOD, pp. 1231–1242. ACM, New York, NY, USA (2013) Răducanu, B., Boncz, P., Zukowski, M.: Micro adaptivity in vectorwise. In: Proc. SIGMOD, pp. 1231–1242. ACM, New York, NY, USA (2013)
44.
go back to reference Sellis, T.K.: Multiple-query optimization. ACM Trans. Database Syst. 13(1), 23–52 (1988)CrossRef Sellis, T.K.: Multiple-query optimization. ACM Trans. Database Syst. 13(1), 23–52 (1988)CrossRef
45.
go back to reference Shatdal, A., Kant, C., Naughton, J.F.: Cache conscious algorithms for relational query processing. Proc. VLDB 1994, 510–521 (1994) Shatdal, A., Kant, C., Naughton, J.F.: Cache conscious algorithms for relational query processing. Proc. VLDB 1994, 510–521 (1994)
46.
go back to reference Sodani, A.: Knights landing (knl): 2nd generation intel(r) xeon phi processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1–24 (Aug 2015) Sodani, A.: Knights landing (knl): 2nd generation intel(r) xeon phi processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1–24 (Aug 2015)
47.
go back to reference Unterbrunner, P., Giannikis, G., Alonso, G., Fauser, D., Kossmann, D.: Predictable performance for unpredictable workloads. PVLDB 2(1), 706–717 (2009)CrossRef Unterbrunner, P., Giannikis, G., Alonso, G., Fauser, D., Kossmann, D.: Predictable performance for unpredictable workloads. PVLDB 2(1), 706–717 (2009)CrossRef
48.
go back to reference Valduriez, P.: Join indices. ACM Trans. Database Syst. 12(2), 218–246 (1987)CrossRef Valduriez, P.: Join indices. ACM Trans. Database Syst. 12(2), 218–246 (1987)CrossRef
49.
go back to reference Zukowski, M., Héman, S., Nes, N., Boncz, P.: Cooperative scans: dynamic bandwidth sharing in a DBMS. Proc. VLDB 2007, 723–734 (2007) Zukowski, M., Héman, S., Nes, N., Boncz, P.: Cooperative scans: dynamic bandwidth sharing in a DBMS. Proc. VLDB 2007, 723–734 (2007)
50.
go back to reference Zukowski, M., Nes, N., Boncz, P.: DSM versus NSM: CPU performance tradeoffs in block-oriented query processing. In: Proc. DaMoN 2008, pp. 47–54 (2008) Zukowski, M., Nes, N., Boncz, P.: DSM versus NSM: CPU performance tradeoffs in block-oriented query processing. In: Proc. DaMoN 2008, pp. 47–54 (2008)
51.
go back to reference Zukowski, M., van de Wiel, M., Boncz, P.: Vectorwise: a vectorized analytical DBMS. Proc. ICDE 2012, 1349–1350 (2012) Zukowski, M., van de Wiel, M., Boncz, P.: Vectorwise: a vectorized analytical DBMS. Proc. ICDE 2012, 1349–1350 (2012)
Metadata
Title
Many-query join: efficient shared execution of relational joins on modern hardware
Authors
Darko Makreshanski
Georgios Giannikis
Gustavo Alonso
Donald Kossmann
Publication date
30-08-2017
Publisher
Springer Berlin Heidelberg
Published in
The VLDB Journal / Issue 5/2018
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-017-0475-4

Other articles of this Issue 5/2018

The VLDB Journal 5/2018 Go to the issue

Premium Partner