ABSTRACT
Selecting appropriate indices and materialized views is critical for high performance in relational databases. Using an example, we show that the problem of schema optimization is also highly relevant for NoSQL databases. We explore the problem of schema design in NoSQL databases with the goal of optimizing query performance while minimizing storage overhead. Our suggested approach uses the cost of executing a given workload against a given schema to guide the mapping from the application data model to a physical schema. We propose a cost-driven approach to this optimization and discuss its usefulness as part of an automated schema design tool.
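The cost-driven idea can be illustrated with a small sketch. This is not the algorithm from the paper; it is a brute-force illustration under stated assumptions: each candidate physical structure (for example, a column family) has a hypothetical estimated size and hypothetical per-query cost estimates, the workload is a set of queries with relative frequencies, and storage overhead is modeled as a simple budget. The names `Candidate`, `workload_cost`, and `choose_schema` are invented for this example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    """A hypothetical candidate physical structure, e.g. a column family."""
    name: str
    size: int            # assumed storage estimate, arbitrary units
    query_costs: dict    # query name -> assumed cost of answering it here


def workload_cost(schema, workload):
    """Frequency-weighted cost of the workload on a schema.

    Each query is charged the cost of the cheapest structure that can
    answer it. A schema that cannot answer some query costs infinity.
    """
    total = 0
    for query, frequency in workload.items():
        costs = [c.query_costs[query] for c in schema if query in c.query_costs]
        if not costs:
            return float("inf")
        total += frequency * min(costs)
    return total


def choose_schema(candidates, workload, storage_budget):
    """Exhaustively pick the cheapest subset of candidates within the budget."""
    best, best_cost = None, float("inf")
    for k in range(1, len(candidates) + 1):
        for schema in combinations(candidates, k):
            if sum(c.size for c in schema) > storage_budget:
                continue  # too much storage overhead
            cost = workload_cost(schema, workload)
            if cost < best_cost:
                best, best_cost = schema, cost
    return best, best_cost


# Illustrative numbers only; none of these come from the paper.
candidates = [
    Candidate("users", 10, {"user_by_id": 1}),
    Candidate("orders", 20, {"orders_by_user": 5}),
    Candidate("orders_with_user", 35, {"user_by_id": 2, "orders_by_user": 1}),
]
workload = {"user_by_id": 3, "orders_by_user": 7}

for budget in (30, 40):
    schema, cost = choose_schema(candidates, workload, budget)
    print(budget, [c.name for c in schema], cost)
```

With these made-up numbers, a tighter storage budget favors the two normalized structures, while a looser one admits the single denormalized structure at a lower workload cost. Exhaustive enumeration is exponential in the number of candidates, so a practical tool would need a smarter search; the sketch only shows how a workload cost estimate can drive the choice.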
Index Terms
- Automated schema design for NoSQL databases