ABSTRACT
In stand-alone databases, the functions of ordering the transaction commits and making the effects of transactions durable are performed in one single action, namely the writing of the commit record to disk. For efficiency, many of these writes are grouped into a single disk operation. In replicated databases in which all replicas agree on the commit order of update transactions, these two functions are typically separated. Specifically, the replication middleware determines the global commit order, while the database replicas make the transactions durable. The contribution of this paper is to demonstrate that this separation causes a significant scalability bottleneck. It forces some of the commit records to be written to disk serially, where in a stand-alone system they could have been grouped together in a single disk write. Two solutions are possible: (1) move durability from the database to the replication middleware, or (2) keep durability in the database and pass the global commit order from the replication middleware to the database. We implement these two solutions. Tashkent-MW is a pure middleware solution that combines durability and ordering in the middleware and treats an unmodified database as a black box. In Tashkent-API, we modify the database API so that the middleware can specify the commit order to the database, thus combining ordering and durability inside the database. We compare both Tashkent systems to an otherwise identical replicated system, called Base, in which ordering and durability remain separated. Under high update transaction loads, both Tashkent systems greatly outperform Base in throughput and response time.
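The bottleneck described above can be illustrated with a small counting model. This is only a sketch under stated assumptions, not the paper's implementation: it assumes each disk write costs one synchronous flush, that up to `group_size` commit records fit in one write, and that in Base every commit is serialized (the worst case; the abstract says only that some are). All function and parameter names are hypothetical.

```python
def flushes_grouped(num_commits: int, group_size: int) -> int:
    """Flushes needed when pending commit records can share one disk write."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    # Integer ceiling division: each flush carries up to group_size records.
    return -(-num_commits // group_size)


def flushes_serial(num_commits: int) -> int:
    """Flushes needed when each commit must be durable before the next starts."""
    return num_commits


def compare(num_commits: int, group_size: int) -> dict:
    """Count flushes under the three arrangements named in the abstract.

    Assumption: Base serializes every commit to enforce the global order,
    while both Tashkent variants regain grouping because ordering and
    durability are decided in the same place.
    """
    return {
        "stand-alone": flushes_grouped(num_commits, group_size),
        "base": flushes_serial(num_commits),
        "tashkent": flushes_grouped(num_commits, group_size),
    }


if __name__ == "__main__":
    for system, flushes in compare(num_commits=100, group_size=10).items():
        print(f"{system}: {flushes} flushes")
```

In this model the gap between Base and the other two arrangements grows linearly with `group_size`, which is the intuition behind uniting durability with ordering.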
Tashkent: uniting durability with transaction ordering for high-performance scalable database replication
Proceedings of the 2006 EuroSys conference