
2017 | OriginalPaper | Chapter

Replica-Aware Partitioning Design in Parallel Database Systems

Authors: Liming Dong, Weidong Liu, Renchuan Li, Tiejun Zhang, Weiguo Zhao

Published in: Euro-Par 2017: Parallel Processing

Publisher: Springer International Publishing


Abstract

In parallel database systems, data is partitioned and replicated across multiple independent nodes to improve system performance and increase robustness. In the current practice of database partitioning design, all replicas are partitioned uniformly. However, different statements may prefer contradictory partitioning plans, so a single plan cannot achieve overall optimal performance for the workload.
In this paper, we propose a novel replica-aware data partitioning design approach to address these contradictions. Based on the access graph of the SQL statements, we use the k-medoids algorithm to classify the workload into statement clusters, and then use a branch-and-bound algorithm to search for the optimal partitioning plan for each cluster. Finally, we organize replicas according to these plans and route statements to their preferred replicas. We evaluate our approach using TPC-E, TPC-H, and the National College and University Enrollment System (NACUES). The evaluation results demonstrate that our approach improves system performance by up to 4x over the current practice of partitioning design.
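
The pipeline described above (cluster statements, pick one plan per cluster, route each statement to a matching replica) can be illustrated with a minimal Python sketch. It is not the paper's implementation: each statement is reduced to a hypothetical set of preferred `table.column` partitioning keys, Jaccard distance stands in for whatever distance the access graph induces, and a simple majority vote stands in for the branch-and-bound plan search. All function names and the toy workload are assumptions made for illustration.

```python
from collections import Counter

def jaccard_distance(a, b):
    """Assumed distance between two statements' preferred partitioning keys."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def assign(items, medoids):
    """Assign each item index to its nearest medoid index."""
    clusters = {m: [] for m in medoids}
    for i, item in enumerate(items):
        nearest = min(medoids, key=lambda m: jaccard_distance(item, items[m]))
        clusters[nearest].append(i)
    return clusters

def k_medoids(items, k, max_iter=20):
    """Plain alternating k-medoids; the first k items seed the medoids."""
    medoids = list(range(k))
    for _ in range(max_iter):
        clusters = assign(items, medoids)
        # New medoid: the member with the smallest total distance to its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(jaccard_distance(items[c], items[j])
                                           for j in members)) if members else m
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return [members for members in assign(items, medoids).values() if members]

def plan_for_cluster(items, members):
    """Stand-in for branch-and-bound: per table, keep the most requested column."""
    votes = {}
    for i in members:
        for key in items[i]:
            table, column = key.split(".")
            votes.setdefault(table, Counter())[column] += 1
    return {table: counts.most_common(1)[0][0] for table, counts in votes.items()}

def route(statement, plans):
    """Send a statement to the replica whose plan satisfies most of its keys."""
    def satisfied(plan):
        return sum(1 for key in statement
                   if plan.get(key.split(".")[0]) == key.split(".")[1])
    return max(range(len(plans)), key=lambda r: satisfied(plans[r]))

# Hypothetical workload: two groups of statements with conflicting preferences.
workload = [
    frozenset({"orders.customer_id", "customer.id"}),
    frozenset({"orders.product_id", "product.id"}),
    frozenset({"orders.customer_id"}),
    frozenset({"orders.product_id"}),
    frozenset({"orders.customer_id", "customer.id"}),
]
clusters = k_medoids(workload, k=2)
plans = [plan_for_cluster(workload, members) for members in clusters]
print(clusters, plans, route(frozenset({"orders.product_id"}), plans))
```

In this toy workload no single partitioning of `orders` suits every statement, which is the contradiction the abstract describes. Giving each replica its own plan lets both groups of statements find a replica partitioned on the key they prefer.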


Metadata

Title: Replica-Aware Partitioning Design in Parallel Database Systems
Authors: Liming Dong, Weidong Liu, Renchuan Li, Tiejun Zhang, Weiguo Zhao
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-64203-1_22
