Abstract
The article proposes a scalable protocol for replication management in large-scale replicated systems. The protocol organizes sites and data replicas into a tree-structured, hierarchical cluster architecture. The basic idea of the protocol is to accomplish the complex task of updating replicated data with a very large number of replicas by a set of related but independently committed transactions. Each transaction is responsible for updating replicas in exactly one cluster and invoking additional transactions for member clusters. Primary copies (one from each cluster) are updated by a cross-cluster transaction. Then each cluster is independently updated by a separate transaction. This decoupled update propagation process results in possible multiple views of replicated data in a cluster. Compared to other replicated data management protocols, the proposed protocol has several unique advantages. First, thanks to a smaller number of replicas each transaction needs to atomically update in a cluster, the protocol significantly reduces the transaction abort rate, which tends to soar in large transactional systems. Second, the protocol improves user-level transaction response time as top-level update transactions are allowed to commit before all replicas have been updated. Third, read-only queries have the flexibility to see database views of different degrees of consistency and data currency. This ranges from global, most up to date, and consistent views, to local, consistent, but potentially old views, to local, nearest to users but potentially inconsistent views. Fourth, the protocol maintains its scalability by allowing dynamic system reconfiguration as it grows by splitting a cluster into two or more smaller ones. Fifth, autonomy of the clusters is preserved as no specific protocol is required to update replicas within the same cluster. Clusters are, therefore, free to use any valid replication or concurrency control protocols.
- ABBADI, A. E., SKEEN, D., AND CHRISTIAN, F. 1985. An efficient fault tolerant protocol for replicated data management. In Proceedings of the Fourth ACM Symposium on Principles of Database Systems (Portland, OR, March), 215-229. Google ScholarDigital Library
- ABBADI, A. E. AND TOUEG, S. 1989. Maintaining availability in partitioned replicated databases. ACM Trans. Database Syst. 14, 2 (June), 264-290. Google ScholarDigital Library
- ADLY, N. 1995. Performance evaluation of HARP: A hierarchical asynchronous replication protocol for large scale systems. Tech. Rep. TR-378 (August), Computer Laboratory, University of Cambridge.Google Scholar
- ADLY, N. AND KUMAR, A. 1994. HPP: A hierarchical propagation protocol for large scale replication in wide area networks. Tech. Rep. TR-331 (March), Computer Laboratory, University of Cambridge.Google Scholar
- ADLY, N., NAGI, M., AND BACON, g. 1993. A hierarchical asynchronous replication protocol for large scale systems. In Proceedings of the IEEE Workshop on Advances in Parallel and Distributed Systems (Princeton, NJ, Oct.), 152-157.Google ScholarCross Ref
- AGRAWAL, D. AND ABBADI, A.E. 1991. An efficient and fault-tolerant solution for distributed mutual exclusion. ACM Trans. Comput. Syst. 9, 1 (Feb.), 1-20. Google ScholarDigital Library
- AGRAWAL, D. AND ABBADI, A.E. 1992. Dynamic logical structures: A position statement for managing replicated data. In Proceedings of the Second Workshop on the Management of Replicated Data (Monterey, CA, Nov. 1992), 26-29.Google ScholarCross Ref
- ALSBERG, P. A. AND DAY, J. D. 1976. A principle for resilient sharing of distributed resources. In Proceedings of the Second International Conference on Data Engineering, (Los Angeles, CA, Feb. 1986), 562-570. Google ScholarDigital Library
- ANON ET AL. 1985. A measure of transaction processing power. Datamation 31, 7 (April), 112-118. Google ScholarDigital Library
- BARAK, A. AND KORNATZKY, Y. 1987. Design principles of operating systems for large scale multicomputers. Tech. Rep., IBM Research Division, T. J. Watson Research Center, Yorktown Heights, NY, RC 13220 (#59114).Google Scholar
- BARBARA, D., GARCIA-MOLINA, H., AND SPAUSTER, A. 1989. Increasing availability under mutual exclusion constraints with dynamic voting reassignment. ACM Trans. Comput. Syst. 7, 4 (Nov.), 394-426. Google ScholarDigital Library
- BERNSTEIN, P. AND GOODMAN, N. 1986. Serializability theory for replicated databases. J. Comput. Syst. Sci. 31, 3 (Dec.), 355-374. Google ScholarDigital Library
- BERNSTEIN, P., HADZILACOS, V., AND GOODMAN, N. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading, MA. Google ScholarDigital Library
- BIRRELL, A. D., LEVIN, R., AND NEEDHAM, R.M. 1981. Grapevine: An exercise in distributed computing. In Proceedings of the Eighth Symposium on Operating Systems Principles (Pacific Grove, CA, Dec.), 178-179. Google ScholarDigital Library
- BLAZE, M. AND ALONSO, R. 1992. Dynamic hierarchical caching in large-scale distributed file systems. In Proceedings of the Twelfth International Distributed Computing Systems Conference (Yokohama, Japan, June 1992), 521-528.Google ScholarCross Ref
- CHEUNG, S., AMMAR, M., AND AHAMAD, A. 1990. The grid protocol: A high performance scheme for maintaining replicated data. In Proceedings of the IEEE Sixth International Conference on Data Engineering (Los Angeles, CA, Feb. 1990), 438-445. Google ScholarDigital Library
- CHUENG, S. Y., AMMAR, M. H., AND AHAMAD, A. 1991. Multidimensional voting. ACM Trans. Comput. Syst. 9, 4 (Nov.), 399-431. Google ScholarDigital Library
- DAVCEV, D. AND BURKHARD, W. 1985. Consistency and recovery control for replicated files. In Proceedings of the Tenth Symposium on Operating Systems Principles (Orcas Island, WA, Dec.), 87-96. Google ScholarDigital Library
- DAVIDSON, S. B., GARCIA-MOLINA, H., AND SKEEN, D. 1985. Consistency in partitioned networks. ACM Comput. Surv. 17, 3 (Sept.), 341-370. Google ScholarDigital Library
- GARCIA-MOLINA, H. AND WIEDERHOLD, G. 1982. Read-only transactions in a distributed database. ACM Trans. Database Syst. 7, 2 (June), 209-234. Google ScholarDigital Library
- GIFFORD, D. K. 1979. Weighted voting for replicated data. In Proceedings of the Seventh Symposium on Operating Systems Principles (Asilomar, CA, Dec.), 150-162. Google ScholarDigital Library
- GOLDING, R.A. 1993. Accessing replicated data in a large-scale distributed system. Ph.D. Thesis, Department of Computer Science, University of California, Santa Cruz.Google Scholar
- GOLDING, R. A. AND LONG, D.E. 1991. Accessing replicated data in a large-scale distributed system. Int. J. Comput. Simul. 1, 2 (Jan.), 347-372.Google Scholar
- GRAY, J., ED. 1991. The Benchmark Handbook for Database and Transaction Processing Systems. Morgan-Kaufmann, San Mateo, CA. Google ScholarDigital Library
- GRAY, J. AND REUTER, A. 1993. Transaction Processing: Concepts and Techniques. Morgan- Kaufmann, San Mateo, CA. Google ScholarDigital Library
- GuY, R. G., PAGE, T. W., HEIDEMANN, J. S., AND POPEK, G.J. 1990. Name transparency in very large scale distributed file systems. In Proceedings of the Second IEEE Workshop on Experimental Distributed Systems (Huntsville, AL, Oct. 1990), 20-25.Google Scholar
- HAERDER, T. AND REUTER, A. 1983. Principles of transaction-oriented database recovery. ACM Comput. Surv. 15, 4 (July), 287-317. Google ScholarDigital Library
- HELAL, A. 1991. Experimental analysis of replication in distributed systems. Ph.D. Thesis, Department of Computer Sciences, Purdue University. Google ScholarDigital Library
- HELAL, A., HEDDAYA, A., AND BHARGAVA, B. 1996. Replication Techniques in Distributed Systems. Kluwer Academic, Advances in Database Systems Book Series, Norwell, MA. Google ScholarDigital Library
- HERLIHY, M. 1986. A quorum consensus replication method for abstract data types. ACM Trans. Comput. Syst. 4, 1 (Feb.), 32-53. Google ScholarDigital Library
- JAJODIA, S. AND MUTCHLER, D. 1987. Dynamic voting. In Proceedings of the Sixteenth SIGMOD Conference (San Francisco, May). Google ScholarDigital Library
- KUMAR, A. 1990a. Hierarchical quorum consensus: A new class of algorithms for replicated data. Tech. Rep. (Aug.), Graduate School of Management, Cornell University.Google Scholar
- KUMAR, A. 1990b. Performance analysis of a hierarchical quorum consensus. In Proceedings of the Tenth International Conference on Distributed Computing Systems (Paris, May), 378-385.Google ScholarCross Ref
- KUMAR, A. AND CHUENG, S. 1991. A high availability V~ hierarchical grid algorithm for replicated data. Inf. Proc. Lett. 40, 311-316. Google ScholarDigital Library
- LIU, X. 1995. Data communication and replication strategies for large scale distributed databases. Ph.D. Thesis, Department of Computer Sciences, Purdue University. Google ScholarDigital Library
- LIU, X., CHENG, L., BHARGAVA, B., AND ZHAO, Z. 1995a. Experimental study of data communication for scalability in distributed databases. Tech. Rep. CSD-TR-95-046 (July), Department of Computer Sciences, Purdue University.Google Scholar
- LIU, X., PITOURA, E., AND BHARGAVA, B. 1995b. Adapting distributed database systems for high availability. Tech. Rep. CSD-TR-95-013 (Feb.), Department of Computer Sciences, Purdue University.Google Scholar
- MAEKAWA, M. 1985. A V~ algorithm for mutual exclusion in decentralized systems. ACM Trans. Comput. Syst. 3, 2 (May), 145-159. Google ScholarDigital Library
- MINOURA, T. AND WIEDERHOLD, G. 1982. Resilient extended true-copy token scheme for a distributed database system. IEEE Trans. Softw. Eng. 9, 5 (May), 173-189.Google Scholar
- OPPEN, D. C. AND DALAL, Y.K. 1981. The clearinghouse: A decentralized agent for locating named objects in a distributed environment. Tech. Rep. OPD-T8103, Xerox Research Center, Palo Alto, CA.Google Scholar
- SANDHU, H. S. AND ZHOU, S. 1992. Cluster-based file replication in large-scale distributed systems. In Proceedings of the ACM SIGMETRICS & Performance Conference (Rhode Island, June), 91-102. Google ScholarDigital Library
- SATYANARAYANAN, M. 1990. Scalable, secure, and highly available distributed file access. IEEE Comput. 23, 5 (May), 9-21. Google ScholarDigital Library
- SATYANARAYANAN, M., KISTLER, J. J., KUMAR, P., OKASAKI, M. E., SIEGEL, E. H., AND STEERE, D.C. 1990. Coda: A highly available file system for a distributed workstation environment. IEEE Trans. Comput. 39, 4 (April), 447-459. Google ScholarDigital Library
- STONEBRAKER, M. 1979. Concurrency control and consistency of multiple copies of data in distributed INGRES. IEEE Trans. Softw. Eng. SE-5, 3 (May), 188-194.Google ScholarDigital Library
- TRANSACTION PROCESSING PERFORMANCE COUNCIL (TPC). 1992. TPC Benchmark C (Rev. 1.0 ed.). Transaction Processing Performance Council (TPC), San Jose, CA.Google Scholar
- ZHANG, Y. 1994. Communication experiments for distributed transaction processing--from LAN to WAN. Ph.D. Thesis, Department of Computer Sciences, Purdue University. Google ScholarDigital Library
- ZHANG, Y. AND BHARGAVA, B. 1993. WANCE: A wide area network communication emulation system. In Proceedings of the IEEE Workshop on Advances in Parallel and Distributed Systems (Princeton, NJ, Oct.), 40-45.Google ScholarCross Ref
Recommendations
Ensuring consistency in multidatabases by preserving two-level serializability
The concept of serializability has been the traditionally accepted correctness criterion in database systems. However in multidatabase systems (MDBSs), ensuring global serializability is a difficult task. The difficulty arises due to the heterogeneity ...
An efficient method for checking object-oriented database schema correctness
Inheritance is introducted in object-oriented systems to enhance code reuse and create more compact and readable software. Powerful object models adopt multiple inheritance, allowing a type (or class) definition to inherit from more than one supertype. ...
Optimization techniques for queries with expensive methods
Object-relational database management systems allow knowledgeable users to define new data types as well as new methods (operators) for the types. This flexibility produces an attendant complexity, which must be handled in new ways for an object-...
Comments