Skip to main content

2016 | OriginalPaper | Buchkapitel

Optimizing Inter-data-center Large-Scale Database Parallel Replication with Workload-Driven Partitioning

verfasst von : Zhen Gao, Hong Min, Xiao Li, Jie Huang, Yi Jin, An Lei, Serge Bourbonnais, Miao Zheng, Gene Fuh

Erschienen in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIV

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Geographically distributed data centers are deployed for non-stop business operations by many enterprises. In case of disastrous events, ongoing workloads must be failed over from the current data center to another active one within just a few seconds to achieve continuous service availability. Software-based parallel database replication techniques are designed to meet very high throughput with near-real-time latency. Understanding workload characteristics is one of the key factors for improving replication performance. In this paper, we propose a workload-driven method to optimize database replication latency and minimize transaction splits with a minimum of parallel replication consistency groups. Our two-phased approach includes (1) a log-based mechanism for workload pattern discovery; (2) a history-based algorithm on pattern analysis, database partitioning and partition adjustment. The experimental results from a real banking batch workload and a benchmark OLTP workload demonstrate the effectiveness of the solution even for partitioning 1000 s of database tables in very large workloads. Finally, the algorithm to automate the cyclic flow of workload profile capturing and partitioning readjustment is developed and verified.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Cecchet, E., Candea, G., Ailamaki, A.: Middleware-based database replication: the gaps between theory and practice. In: SIGMOD (2008) Cecchet, E., Candea, G., Ailamaki, A.: Middleware-based database replication: the gaps between theory and practice. In: SIGMOD (2008)
2.
Zurück zum Zitat Codd, E.F.: The Relational Model for Database Management, Version 2. Addison-Wesley, New York (1990). ISBN: 9780201141924 MATH Codd, E.F.: The Relational Model for Database Management, Version 2. Addison-Wesley, New York (1990). ISBN: 9780201141924 MATH
3.
Zurück zum Zitat Corbett, J.C., et al.: Spanner: Google’s globally-distributed database. In: OSDI (2012) Corbett, J.C., et al.: Spanner: Google’s globally-distributed database. In: OSDI (2012)
4.
Zurück zum Zitat Curino, C., Jones, E., Zhang, Y., Madden, S.: Schism: a workload-driven approach to database replication and partitioning. Proc. VLDB 3, 48–57 (2010)CrossRef Curino, C., Jones, E., Zhang, Y., Madden, S.: Schism: a workload-driven approach to database replication and partitioning. Proc. VLDB 3, 48–57 (2010)CrossRef
5.
Zurück zum Zitat DeCusatis, C.: Handbook of Fiber Optic Data Communication: A Practical Guide to Optical Networking, 4th edn. Academic Press, London (2013). ISBN: 10 0124016731 DeCusatis, C.: Handbook of Fiber Optic Data Communication: A Practical Guide to Optical Networking, 4th edn. Academic Press, London (2013). ISBN: 10 0124016731
6.
Zurück zum Zitat Fiduccia, C.M., Mattheyses, R.M.: A linear-time heuristic for improving network partitions. In: Proceedings of the 19th Design Automation Conference, pp. 175–181, January 1982 Fiduccia, C.M., Mattheyses, R.M.: A linear-time heuristic for improving network partitions. In: Proceedings of the 19th Design Automation Conference, pp. 175–181, January 1982
7.
Zurück zum Zitat Garey, M.R., Johnson, D.S.: Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York (1990) Garey, M.R., Johnson, D.S.: Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York (1990)
8.
Zurück zum Zitat Graham, R.L.: Bounds on multiprocessing anomalies and related packing algorithms. In: AFIPS Spring Joint Computing Conference, pp. 205–217 (1972) Graham, R.L.: Bounds on multiprocessing anomalies and related packing algorithms. In: AFIPS Spring Joint Computing Conference, pp. 205–217 (1972)
9.
Zurück zum Zitat Gray, J., Helland, P., O’Neil, P., Shasha, D.: The dangers of replication and a solution. In: SIGMOD (1996) Gray, J., Helland, P., O’Neil, P., Shasha, D.: The dangers of replication and a solution. In: SIGMOD (1996)
10.
Zurück zum Zitat Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)MathSciNetCrossRef Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)MathSciNetCrossRef
11.
Zurück zum Zitat Karypis, G., Kumar, V.: Multilevel algorithms for multi-constraint graph partitioning. In: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing (1998) Karypis, G., Kumar, V.: Multilevel algorithms for multi-constraint graph partitioning. In: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing (1998)
12.
Zurück zum Zitat Kemme, B., Jiménez-Peris, R., Patiño-Martínez, M.: Database replication. Synth. Lect. Data Manag. 5, 1–153 (2010). Morgan & Claypool PublishersCrossRef Kemme, B., Jiménez-Peris, R., Patiño-Martínez, M.: Database replication. Synth. Lect. Data Manag. 5, 1–153 (2010). Morgan & Claypool PublishersCrossRef
13.
Zurück zum Zitat Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Techn. J. 49, 291–307 (1970)MATHCrossRef Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Techn. J. 49, 291–307 (1970)MATHCrossRef
14.
Zurück zum Zitat Lin, Y., Kemme, B., Patiño-Martínez, M., Jiménez-Peris, R.: Middleware based data replication providing snapshot isolation. In: SIGMOD (2005) Lin, Y., Kemme, B., Patiño-Martínez, M., Jiménez-Peris, R.: Middleware based data replication providing snapshot isolation. In: SIGMOD (2005)
15.
Zurück zum Zitat Patiño-Martínez, M., Jiménez-Peris, R., Kemme, B., Alonso, G.: MIDDLE-R: consistent database replication at the middleware level. ACM TOCS 23(4), 375–423 (2005)CrossRef Patiño-Martínez, M., Jiménez-Peris, R., Kemme, B., Alonso, G.: MIDDLE-R: consistent database replication at the middleware level. ACM TOCS 23(4), 375–423 (2005)CrossRef
16.
Zurück zum Zitat Pavlo, A., Curino, C., Zdonik, S.B.: Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems. In: SIGMOD (2012) Pavlo, A., Curino, C., Zdonik, S.B.: Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems. In: SIGMOD (2012)
17.
Zurück zum Zitat Pothen, A., Simon, H.D., Liou, K.: Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11(3), 430–452 (1990)MATHMathSciNetCrossRef Pothen, A., Simon, H.D., Liou, K.: Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11(3), 430–452 (1990)MATHMathSciNetCrossRef
18.
Zurück zum Zitat Quamar, A., Kumar, K.A., Deshpande, A.: SWORD: scalable workload-aware data placement for transactional workloads. In: EDBT (2013) Quamar, A., Kumar, K.A., Deshpande, A.: SWORD: scalable workload-aware data placement for transactional workloads. In: EDBT (2013)
19.
Zurück zum Zitat Serrano, D., Patiño-Martínez, M., Jiménez-Peris, R., Kemme, B.: Boosting database replication scalability through partial replication and 1-copy-snapshot-isolation. In: Proceedings of the 13th PRDC (2007) Serrano, D., Patiño-Martínez, M., Jiménez-Peris, R., Kemme, B.: Boosting database replication scalability through partial replication and 1-copy-snapshot-isolation. In: Proceedings of the 13th PRDC (2007)
20.
Zurück zum Zitat Stonebraker, M.: The Case for Shared Nothing. IEEE Database Eng. Bull. 9(1), 4–9 (1986) Stonebraker, M.: The Case for Shared Nothing. IEEE Database Eng. Bull. 9(1), 4–9 (1986)
Metadaten
Titel
Optimizing Inter-data-center Large-Scale Database Parallel Replication with Workload-Driven Partitioning
verfasst von
Zhen Gao
Hong Min
Xiao Li
Jie Huang
Yi Jin
An Lei
Serge Bourbonnais
Miao Zheng
Gene Fuh
Copyright-Jahr
2016
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-49214-7_6

Neuer Inhalt