ABSTRACT
We present a memory-aware load balancing (MALB) technique to dispatch transactions to replicas in a replicated database. Our MALB algorithm exploits knowledge of the working sets of transactions to assign them to replicas in such a way that they execute in main memory, thereby reducing disk I/O. In support of MALB, we introduce a method to estimate the size and the contents of transaction working sets. We also present an optimization called update filtering that reduces the overhead of update propagation between replicas.
We show that MALB greatly improves performance over other load balancing techniques -- such as round robin, least connections, and locality-aware request distribution (LARD) -- that do not use explicit information on how transactions use memory. In particular, LARD demonstrates good performance for read-only static content Web workloads, but it gives performance inferior to MALB for database workloads as it does not efficiently handle large requests. MALB combined with update filtering further boosts performance over LARD.
We build a prototype replicated system, called Tashkent+, with which we demonstrate that MALB and update filtering techniques improve performance of the TPC-W and RUBiS benchmarks. In particular, in a 16-replica cluster and using the ordering mix of TPC-W, MALB doubles the throughput over least connections and improves throughput 52% over LARD. MALB with update filtering further improves throughput to triple that of least connections and more than double that of LARD. Our techniques exhibit super-linear speedup; the throughput of the 16-replica cluster is 37 times the peak throughput of a standalone database due to better use of the cluster's memory.
- {ACZ03-1} C. Amza, A. Cox, and W. Zwaenepoel. Conflict-Aware Scheduling for Dynamic Content Applications. In Proceedings of the Fifth USENIX Symposium on Internet Technologies and Systems, March 2003. Google ScholarDigital Library
- {ACZ03-2} C. Amza, A. Cox, and W. Zwaenepoel. Distributed Versioning: Consistent Replication for Scaling Back-end Databases of Dynamic Content Web Sites. In ACM/IFIP/Usenix International Middleware Conference, June 2003. Google ScholarDigital Library
- {ACC+02} C. Amza, E. Cecchet, A. Chanda, Alan L. Cox, S. Elnikety, R. Gil, J. Marguerite, K. Rajamani and W. Zwaenepoel. Specification and Implementation of Dynamic Web Site Benchmarks. WWC-5: The 5th Annual IEEE Workshop on Workload Characterization, Austin, Texas, USA, November 2002.Google Scholar
- {BB97} A. Barak and A. Braverman. Memory Ushering in a Scalable Computing Cluster, In Proceedings of the IEEE Third International Conference on Algorithms and Architecture for Parallel Processing, Melbourne, 1997.Google ScholarCross Ref
- {BBG+95} H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O'Neil, and P. O'Neil. A Critique of ANSI SQL Isolation Levels. In Proceedings of the SIGMOD International Conference on Management of Data, May 1995. Google ScholarDigital Library
- {BCD+06} S. Bouchenak, A. Cox, S. Dropsho, S. Mittal, and W. Zwaenepoel. Caching Dynamic Web Content: Designing and Analysing an Aspect-oriented Solution. In Proceedings of Middleware 2006, Melbourne, Australia, November 2006. Google ScholarDigital Library
- {BHG87} P. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987. Google ScholarDigital Library
- {CP00} L. Cherkasova and S. Ponnekanti. Optimizing a "Content-Aware" Load Balancing Strategy for Shared Web Hosting Service. In the Proceedings of the 8th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), San Francisco, CA, USA, August 2000. Google ScholarDigital Library
- {CS04} S. Kyun Cha and C. Song. P*TIME: Highly Scalable OLTP DBMS for Managing Update-Intensive Stream Workload. In Proceedings of 30nd International Conference on Very Large Data Bases (VLDB 2004), Toronto, Canada, August 2004. Google ScholarDigital Library
- {DS04} K. Daudjee and K. Salem. Lazy Database Replication with Ordering Guarantees. In Proceedings of the 20th International Conference on Data Engineering (ICDE 2004), Boston, MA, USA, March 2004. Google ScholarDigital Library
- {DS06} K. Daudjee and K. Salem. Lazy Database Replication with Snapshot Isolation. In Proceedings of 32nd International Conference on Very Large Data Bases (VLDB 2006), Seoul, Korea, September 2006. Google ScholarDigital Library
- {EDP06} S. Elnikety, S. Dropsho, and F. Pedone. Tashkent: Uniting Durability with Transaction Ordering for High-Performance Scalable Database Replication. In the Proceedings of EuroSys, Leuven, Belgium, April 2006. Google ScholarDigital Library
- {ENTZ04} S. Elnikety, E. Nahum, J. Tracey, and W. Zwaenepoel. A Method for Transparent Admission Control and Request Scheduling in E-Commerce Web Sites. In the Proceedings of the 13th International World Wide Web Conference (WWW2004), New York, NY, USA, May 2004. Google ScholarDigital Library
- {EPZ05} S. Elnikety, F. Pedone, and W. Zwaenepoel, Database Replication Using Generalized Snapshot Isolation. IEEE Symposium on Reliable Distributed Systems (SRDS 2005), Orlando, Florida, October 2005. Google ScholarDigital Library
- {F05} A. Fekete. Allocating Isolation Levels to Transactions. ACM Sigmod, Baltimore, Maryland, June 2005. Google ScholarDigital Library
- {F1-99} A. Fekete. Serialisability and Snapshot Isolation. In Proceedings of the Australian Database Conference, pages 201--210, Auckland, New Zealand, January 1999.Google Scholar
- {F2-99} L. Frank. Evaluation of the Basic Remote Backup and Replication Methods for High Availability Databases. Software Practice and Experience, 29:1339--1353, 1999. Google ScholarDigital Library
- {FLO+96} A. Fekete, D. Liarokapis, E. O'Neil, P. O'Neil, and D. Shasha. Making Snapshot Isolation Serializable. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 173--182, June 1996.Google Scholar
- {GD03} L. Gao, M. Dahlin, A. Nayate, J. Zheng, and A. Iyengar. Application Specific Data Replication for Edge Services. In Proceedings of the Twelfth International Conference on the World Wide Web, pages 449--460. ACM Press, 2003. Google ScholarDigital Library
- {GHOD96} J. Gray, P. Helland, P. O'Neil, and D. Shasha. The Dangers of Replication and a Solution. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal (Canada), June 1996. Google ScholarDigital Library
- {HCG+06} T. Heath, A. P. Centeno, P. George, L. Ramos, Y. Jaluria, R. Bianchini. Mercury and Freon: Temperature Emulation and Management for Server Systems. In Proceedings of Architectural Support for Programming Languages and Operating Systems (ASPLOS'06), San Jose, CA, October, 2006. Google ScholarDigital Library
- {J95} K. Jacobs. Concurrency Control, Transaction Isolation and Serializability in SQL92 and Oracle7. Technical report number A33745, Oracle Corporation, Redwoord City, CA, July 1995.Google Scholar
- {KA00} B. Kemme and G. Alonso. Don't be Lazy, Be Consistent: Postgres-R, a New Way to Implement Database Replication. In Proceedings of 26th International Conference on Very Large Data Bases (VLDB 2000), Cairo, Egypt, September 2000. Google ScholarDigital Library
- {KA98} B. Kemme and G. Alonso. A Suite of Database Replication Protocols Based on Group Communication Primitives. In Proceedings 18th International Conference on Distributed Computing Systems (ICDCS), Amsterdam, The Netherlands, May 1998. Google ScholarDigital Library
- {KS00} M. Karlsson, and P. Stenström. An Analytical Model of the Working-Set Sizes in Decision-Support Systems. In Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '00). Santa Clara, CA, USA, June 2000. Google ScholarDigital Library
- {L98} L. Lamport. The Part-time Parliament. ACM Transactions on Computer Systems, 16(2):133--169, May 1998. Google ScholarDigital Library
- {L99} A. Lodi. Algorithms for Two-Dimensional Bin Packing and Assignment Problems. PhD thesis. Uiversita Degli Studi di Bologna. 1999.Google Scholar
- {LKPJ05} Y. Lin, B. Kemme, M. Patiño-Martínez, R. Jiménez-Peris. Middleware Based Data Replication Providing Snapshot Isolation. ACM International Conference on Management of Data (SIGMOD), Baltimore, Maryland, June 2005. Google ScholarDigital Library
- {OW} ObjectWeb Consortium. Rice University bidding system. http://rubis.objectweb.org/.Google Scholar
- {Ora97} Data Concurrency and Consistency, Oracle8 Concepts, Release 8.0: Chapter 23. Technical report, Oracle Corporation, 1997.Google Scholar
- {P86} C. Papadimitriou. The Theory of Database Concurrency Control. Computer Science Press, 1986. Google ScholarDigital Library
- {PA04} C. Plattner and G. Alonso. Ganymed: Scalable Replication for Transactional Web Applications. In Proceedings of the 5th ACM/IFIP/USENIX International Middleware Conference, Toronto, Canada, October 2004. Google ScholarDigital Library
- {PAB+98} V. S. Pai, M. Aron, G. Banga, M. Svendsen, P. Druschel, W. Zwaenepoel, and E. Nahum. Locality-Aware Request Distribution in Cluster-based Network Servers. In Proceedings of the 8th ACM Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, October 1998. Google ScholarDigital Library
- {PG} PostgreSQL, SQL Compliant, Open Source Object-Relational Database Management System. http://www.postgresql.org/.Google Scholar
- {PJKA05} M. Patiño-Martínez, R. Jiménez-Peris, B. Kemme, and G. Alonso. Consistent Database Replication at the Middleware Level. ACM Transactions on Computer Systems (TOCS). Volume 23, No. 4, 2005, pp 1--49. Google ScholarDigital Library
- {PL91} C. Pu and A. Leff. Replica Control in Distributed Systems: An Asynchronous Approach. SIGMOD Record (ACM Special Interest Group on Management of Data), 20(2):377--386, June 1991. Google ScholarDigital Library
- {RS04} R. van Renesse, and F. B. Schneider. Chain Replication for Supporting High Throughput and Availability. Sixth Symposium on Operating Systems Design and Implementation (OSDI '04). USENIX Association, (San Francisco, California, December 2004). Google ScholarDigital Library
- {RT04} M. Ronström and L. Thalmann. MySQL Cluster Architecture Overview. MySQL Technical White Paper, 2004.Google Scholar
- {Sch90} F. B. Schneider. Implementing fault-tolerant services using the state machine approach: a tutorial. In ACM Computing Surveys. 22 (4):299--319, December 1990. Google ScholarDigital Library
- {TPC} Transaction Processing Performance Council -- http://www.tpc.org/.Google Scholar
- {WK05} S. Wu and B. Kemme. Postgres-R(SI): Combining Replica Control with Concurrency Control based on Snapshot Isolation. In Proceedings of International Conference on Data Engineering (ICDE), April 2005. Google ScholarDigital Library
- {WPS+00} M. Wiesmann, F. Pedone, A. Schiper, B. Kemme, and G. Alonso. Understanding replication in databases and distributed systems. In Proceedings of 20th International Conference on Distributed Computing Systems (ICDCS'2000), Taipei, Taiwan, April 2000. Google ScholarDigital Library
- {ZBCS99} X. Zhang, M. Barrientos, J. B. Chen, and M. Seltzer. HACC: An Architecture for Cluster-Based Web Servers. In Proceedings of the 3rd USENIX Windows NT Symposium, Seattle, WA, July 1999. Google ScholarDigital Library
- {ZP06} V. Zuikeviciute and F. Pedone. Conflict-Aware Load-Balancing Techniques for Database Replication. USI Technical Report, 2006.Google Scholar
Index Terms
- Tashkent+: memory-aware load balancing and update filtering in replicated databases
Recommendations
Tashkent+: memory-aware load balancing and update filtering in replicated databases
EuroSys'07 Conference ProceedingsWe present a memory-aware load balancing (MALB) technique to dispatch transactions to replicas in a replicated database. Our MALB algorithm exploits knowledge of the working sets of transactions to assign them to replicas in such a way that they execute ...
Tashkent: uniting durability with transaction ordering for high-performance scalable database replication
EuroSys '06: Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006In stand-alone databases, the functions of ordering the transaction commits and making the effects of transactions durable are performed in one single action, namely the writing of the commit record to disk. For efficiency many of these writes are ...
Tashkent: uniting durability with transaction ordering for high-performance scalable database replication
Proceedings of the 2006 EuroSys conferenceIn stand-alone databases, the functions of ordering the transaction commits and making the effects of transactions durable are performed in one single action, namely the writing of the commit record to disk. For efficiency many of these writes are ...
Comments