2016 | OriginalPaper | Chapter

Optimizing Inter-data-center Large-Scale Database Parallel Replication with Workload-Driven Partitioning

Authors: Zhen Gao, Hong Min, Xiao Li, Jie Huang, Yi Jin, An Lei, Serge Bourbonnais, Miao Zheng, Gene Fuh

Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIV

Publisher: Springer Berlin Heidelberg


Abstract

Many enterprises deploy geographically distributed data centers for non-stop business operations. In case of a disastrous event, ongoing workloads must fail over from the current data center to another active one within just a few seconds to achieve continuous service availability. Software-based parallel database replication techniques are designed to deliver very high throughput with near-real-time latency. Understanding workload characteristics is one of the key factors for improving replication performance. In this paper, we propose a workload-driven method to optimize database replication latency and minimize transaction splits with a minimum of parallel replication consistency groups. Our two-phase approach includes (1) a log-based mechanism for workload pattern discovery and (2) a history-based algorithm for pattern analysis, database partitioning, and partition adjustment. Experimental results from a real banking batch workload and a benchmark OLTP workload demonstrate the effectiveness of the solution, even when partitioning thousands of database tables in very large workloads. Finally, we develop and verify an algorithm that automates the cyclic flow of workload profile capturing and partition readjustment.
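To make the notion of a "transaction split" concrete, here is a minimal Python sketch. It is not the chapter's algorithm, which is not reproduced on this page. It assumes a workload profile given as a list of transactions, each a set of table names, and it assumes that a transaction is split whenever its tables fall into more than one consistency group. The function names and table names are hypothetical.

```python
def count_splits(transactions, group_of):
    """Count transactions whose tables land in more than one group."""
    return sum(
        1 for tables in transactions
        if len({group_of[t] for t in tables}) > 1
    )


def co_access_groups(transactions):
    """Group tables with union-find: tables touched by the same
    transaction end up in the same group (illustrative baseline only)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for tables in transactions:
        tables = list(tables)
        for t in tables:
            find(t)  # register every table, even singletons
        for other in tables[1:]:
            parent[find(other)] = find(tables[0])
    return {t: find(t) for t in list(parent)}


# Hypothetical workload profile: each set is one transaction's tables.
transactions = [
    {"ACCOUNT", "LEDGER"},
    {"LEDGER", "AUDIT"},
    {"CUSTOMER"},
]

# One group per table maximizes parallelism but splits transactions.
one_group_per_table = {t: t for tables in transactions for t in tables}

# Grouping co-accessed tables avoids splits at the cost of fewer groups.
groups = co_access_groups(transactions)

print(count_splits(transactions, one_group_per_table))
print(count_splits(transactions, groups))
print(len(set(groups.values())))
```

Grouping by connected components removes every split, but on a real workload it can collapse most tables into one large group and lose parallelism. That tension between few splits and balanced parallel groups is what a workload-driven partitioning method has to manage.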


Metadata
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-662-49214-7_6