Abstract
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.