Abstract
Cloud storage is an emerging infrastructure that offers Platform as a Service (PaaS). On such platforms, storage and compute power are adjusted dynamically, so it is important to build a highly scalable and reliable storage system that can scale elastically on demand with minimal startup cost.
In this paper, we propose ecStore -- an elastic cloud storage system that supports automated data partitioning and replication, load balancing, efficient range queries, and transactional access. In ecStore, data objects are distributed and replicated across a cluster of commodity computer nodes located in the cloud. Users access data via transactions, which bundle read and write operations on multiple data items that may be stored on different cluster nodes.
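As a rough illustration of how range partitioning with replication can serve range queries, the following minimal sketch keeps sorted split keys and routes each key to a partition by binary search. It is not ecStore's implementation: the class name `RangePartitionedStore`, the consecutive-node replica placement, and the in-memory dictionaries standing in for cluster nodes are all assumptions made for the example.

```python
import bisect


class RangePartitionedStore:
    """Toy range-partitioned key-value store with replicas (illustrative only)."""

    def __init__(self, split_keys, num_nodes, replicas=2):
        # Sorted boundaries: partition i holds keys in
        # [split_keys[i-1], split_keys[i]), open-ended at both extremes.
        self.split_keys = sorted(split_keys)
        self.num_nodes = num_nodes
        self.replicas = replicas
        # One dict per node, standing in for that node's local storage.
        self.nodes = [dict() for _ in range(num_nodes)]

    def partition_of(self, key):
        # Binary search over the boundaries gives the partition index.
        return bisect.bisect_right(self.split_keys, key)

    def nodes_for(self, partition):
        # Hypothetical placement rule: replicas live on consecutive nodes.
        return [(partition + r) % self.num_nodes for r in range(self.replicas)]

    def put(self, key, value):
        # Write the item to every replica of its partition.
        for n in self.nodes_for(self.partition_of(key)):
            self.nodes[n][key] = value

    def get(self, key):
        # Read from the first replica of the key's partition.
        primary = self.nodes_for(self.partition_of(key))[0]
        return self.nodes[primary].get(key)

    def range_query(self, low, high):
        """Return sorted (key, value) pairs with low <= key <= high."""
        first, last = self.partition_of(low), self.partition_of(high)
        result = {}
        # Only partitions overlapping [low, high] are contacted.
        for p in range(first, last + 1):
            primary = self.nodes_for(p)[0]
            for k, v in self.nodes[primary].items():
                if low <= k <= high:
                    result[k] = v
        return sorted(result.items())
```

Because neighbouring keys fall into the same or adjacent partitions, a range query touches only the partitions that overlap the requested interval, which is the property that hash partitioning gives up.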
The architecture of ecStore follows a stratum design: an underlying distributed index at the bottom, a replication layer in the middle, and a transaction management layer on top. ecStore provides adaptive read consistency on replicated data. We also enhance the system with an effective load balancing scheme that uses a self-tuning replication technique specially designed for large-scale data. Furthermore, ecStore uses a multi-version optimistic concurrency control scheme, which matches the characteristics of data in cloud storage well. To validate the performance of the system, we have conducted extensive experiments on various platforms, including a commercial cloud (Amazon's EC2), an in-house cluster, and PlanetLab.
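To make the idea of multi-version optimistic concurrency control concrete, here is a minimal single-process sketch. It shows the generic technique only and should not be read as ecStore's protocol: the names `MultiVersionStore`, `Transaction`, and `ConflictError`, the logical clock, and the validation rule (abort if any key read or written has a newer committed version than the transaction's snapshot) are assumptions chosen for the example.

```python
class ConflictError(Exception):
    """Raised when commit-time validation fails."""


class MultiVersionStore:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), ascending
        self.clock = 0      # logical commit timestamp

    def begin(self):
        # A transaction reads from the snapshot as of its start timestamp.
        return Transaction(self, self.clock)

    def read_at(self, key, ts):
        # Return the latest version committed at or before ts.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def latest_ts(self, key):
        chain = self.versions.get(key)
        return chain[-1][0] if chain else 0


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.read_set = set()
        self.writes = {}  # buffered writes, invisible until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_set.add(key)
        return self.store.read_at(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation phase: no locks were taken, so check for conflicts now.
        for key in self.read_set | set(self.writes):
            if self.store.latest_ts(key) > self.start_ts:
                raise ConflictError(key)
        # Write phase: install new versions under a fresh timestamp.
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.clock, value)
            )
        return self.store.clock
```

In this sketch readers never block writers: a transaction keeps seeing the versions that were current when it started, and conflicts are detected only at commit time, which tends to suit workloads where reads dominate and write conflicts are rare.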
Index Terms
- Towards elastic transactional cloud storage with range query support